Preface to the Third Edition


In the present edition we have made changes in Chapter 1, mainly as a result of comments 
by Professor A. S. Besicovitch. Some theorems are stated more explicitly, a few proofs are 
added, and some are shortened. We are indebted to him for an elementary proof of the 
theorem of bounded convergence for Riemann integrals, which appears in the notes. In 
Chapter 6 the proof of Poisson’s equation has been improved. In Chapter 17 we have 
discussed the Airy integral for complex argument in more detail, and have given conditions
for uniformity of approximation for asymptotic solutions of Green’s type for
complex argument. In Chapter 23 we have added some remarks on the analytic continuation
of the solutions, and a note applies them to the parabolic cylinder functions.

We should like to express our thanks to several readers for drawing our attention to
errors and misprints.

HAROLD JEFFREYS

BERTHA JEFFREYS

April 1953 

Preface to the Second Edition 


As a second edition of this book has been called for, we have taken the opportunity of 
making considerable revisions. Most of the notes at the end have been incorporated in 
the text. Otherwise the principal changes are as follows. In Chapter 1, the Heine-Borel 
theorem and Goursat’s modification have been placed early, and used to derive several 
theorems that had been proved by separate applications of methods that could be used 
to prove the general theorems. In other respects, notably the theory of the Riemann 
integral, the theory has been given more fully. In Chapter 4 an account of block 
matrices has been added, and the theorem on characteristic solutions of commuting 
matrices has been more fully discussed. Chapter 5 (multiple integrals) has been almost 
completely rewritten, and now includes an account of the theory of functions of several 
variables, part of which was given in Chapter 11. In Chapter 9 the treatment of relaxation
methods has been extended, and should now serve as an adequate introduction
to the special works on the subject. Many improvements have been made in Chapters 11 
and 12, including an important correction to the proof of Cauchy’s theorem, a proof of 
the Osgood-Vitali theorem, and a complete revision of the theory of inverse functions. 
In Chapter 17 the conditions for the truth of Watson’s lemma have been somewhat 
relaxed, so that they are now wide enough to cover almost all physical applications, 
and the method of stationary phase is more fully treated. In Chapter 24 the treatment 
of multipole radiation has been extended. 

Where possible the proofs have been either replaced by shorter ones or generalized. 
Some new examples have been added. 

We are indebted to numerous correspondents for pointing out errata. The two most 
serious corrections were given by Professor J. E. Littlewood and Dr M. L. Cartwright. 
We are particularly grateful for comments by Professor Littlewood (Chapters 1, 5, 11 and
12), Mr P. Hall (Chapter 4), Professor A. S. Besicovitch and Dr J. C. Burkill (Chapter 5). 

HAROLD JEFFREYS 

BERTHA JEFFREYS 

15 November 1948 


Preface to the First Edition 


This book is intended to provide an account of those parts of pure mathematics that are 
most frequently needed in physics. The choice of subject-matter has been rather difficult. 

A book containing all methods used in different branches of physics would be impossibly 
long. We have generally included a method if it has applications in at least two branches, 
though we do not claim to have followed the rule invariably. Abundant applications to
special problems are given as illustrations. We think that many students whose interests 
are mainly in applications have difficulty in following abstract arguments, not on account 
of incapacity, but because they need to ‘see the point’ before their interest can be 
aroused. 

A knowledge of calculus is assumed. Some explanation of the standard of rigour and 
generality aimed at is desirable. We do not accept the common view that any argument 
is good enough if it is intended to be used by scientists. We hold that it is as necessary 
to science as to pure mathematics that the fundamental principles should be clearly 
stated and that the conclusions shall follow from them. But in science it is also necessary 
that the principles taken as fundamental should be as closely related to observation as 
possible; it matters little to pure mathematics what is taken as fundamental, but it is of 
primary importance to science. We maintain therefore that careful analysis is more 
important in science than in pure mathematics, not less. We have also found repeatedly 
that the easiest way to make a statement reasonably plausible is to give a rigorous proof. 
Some of the most important results (e.g. Cauchy’s theorem) are so surprising at first 
sight that nothing short of a proof can make them credible. On the other hand, a pure 
mathematician is usually dissatisfied with a theorem until it has been stated in its most 
general form. The scientific applications are often limited to a few special types. We have 
therefore often given proofs under what a pure mathematician will consider unnecessarily
restrictive conditions, but these are satisfied in most applications. Generality is
a good thing, but it can be purchased at too high a price. Sometimes, if the conditions
we adopt are not satisfied in a particular problem, the method of extending the theorem
will be obvious; but it is sometimes very difficult, and we have not thought it worth
while to make elaborate provision against cases that are seldom met. For some extensive
subjects, which are important but need long discussion and are well treated in some
standard book, we have thought it sufficient to give references.

We consider it especially important that scientists should have reasonably accessible 
statements of conditions for the truth of the theorems that they use. One often sees a 
statement that some result has been rigorously proved, unaccompanied by any verification
that the conditions postulated in the proof are satisfied in the actual problem, and
very often they are not. This misuse of mathematics is to be found in most branches of 
science. On the other hand, many results are usually proved under conditions that are 
sufficient but not necessary, and scientists often hesitate to use them, under the mistaken 
belief that they are necessary. We have therefore often given proofs under more general 
conditions than are usually taught to scientists, where the usual sufficient conditions 
are often not satisfied in practice but less stringent ones are satisfied. Both troubles are 
due chiefly to the fact that the theorems are scattered through many books and papers, 
and the scientist does not know what to look for or where to look. 
The book can be read consecutively, but some parts are independent of much that
precedes them, and it is possible, and indeed desirable, to study different chapters concurrently.
In some cases we have given special cases of a theorem before the general
form where the latter involves more elaborate treatment, especially where the student 
is likely to meet applications to several instances of the special cases before he needs 
the general theorem. 

We hesitated before including a chapter on the theory of functions of a real variable. 
This is far from a complete treatment, but fuller works are mostly longer than the 
theoretical physicist has time to read; and unfortunately they sometimes relegate 
theorems that are frequently needed to small type or unworked examples, or omit them 
altogether. We have aimed at giving accounts of the principal methods of the theory 
but not at proving every result in detail; but we think that students will benefit by
filling in some of the details for themselves. If a student has difficulty in achieving the 
degree of abstraction needed in most of this chapter, we advise him to read as much as 
he can stand and then proceed to a later chapter, referring back when necessary. He 
will find that he has covered the whole of it before finishing Chapter 14, and that he
knows both what is there and why it is there. We have not succeeded in avoiding forward 
references altogether, but the most serious, the proof in Chapter 12 of the theorem that 
an algebraic equation of degree n has n roots, used in Chapter 4, is so time-honoured 
that a few smaller transgressions may, we hope, be forgiven. 

The notation of special functions has grown up haphazard, and is inconvenient in 
several respects. Quantum theorists are making wholesale changes of definition to 
ensure normalization, but we consider that this replaces the old complications by new 
ones. We have modified the usual definitions of the Legendre functions, with the result 
that a more symmetrical treatment becomes possible and the relation to Bessel functions 
becomes free from complicated numerical factors. We have returned to Heaviside’s 
definition of the function Kₙ but denoted it by Khₙ. Among other advantages, this
simplifies the relation to Legendre functions of the second type. We have also dropped
the Γ notation for the factorial function, which seems to have no recommendations
whatever. 

The immediate stimulus for the book was the announcement that the second edition 
of Operational Methods in Mathematical Physics by one of us was out of print. Most of 
this tract has been incorporated and later developments have been added. The chapter 
on dispersion was somewhat out of place in the tract, as it was largely independent of 
the operational method, but was included because the notion of group velocity had not 
previously been discussed in relation to the method of steepest descents. It now finds 
a more natural place in a chapter on asymptotic expansions, in which some methods 
widely used but hitherto accessible only in scattered papers are also described. Most 
of Cartesian Tensors has also been incorporated. The applications of thermodynamics 
in it to hydrodynamics and elasticity would be more suitably treated in textbooks of 
the latter subjects. 

We have not tried to give a detailed account of any branch of physics; that is a matter 
for the special text-books. 

We are deeply indebted to many friends for their encouragement during the writing 
of this book. Above all we must thank Dr F. Smithies, who placed his great knowledge 
freely at our disposal, and generously helped in the proof reading. His suggestions have 
been invaluable. It is only fair to him to say that in some places we have persisted in 
our ways in spite of his vigorous protests. Dr J. C. P. Miller gave us special help with 
Chapters 9 and 23, and Mr H. Bondi with Chapter 24. We have also had valuable 
suggestions at various points from Professors M. H. A. Newman, A. C. Offord, 
L. Rosenhead and H. W. Turnbull, and from Mr A. S. Besicovitch, Miss M. L. Cartwright 
and Mr D. P. Dalzell. 

We also thank the Universities of Cambridge, London and Manchester for permission 
to use examination questions as examples, and the staff of the Cambridge University 
Press for their care in the printing and their readiness to meet the wishes of a rather 
exacting pair of authors. 


HAROLD JEFFREYS 
BERTHA JEFFREYS 
The main sections of each chapter are numbered decimally at intervals of 0·01;
subsections are indicated by further decimals. When the argument of a section or 
subsection continues that of the previous one, the numbering of the equations also 
continues. 

Notes at the end are numbered according to the subsection referred to; references to 
them are indicated by a small index letter in heavy type in the text; for instance, the a 
on p. 52, in subsection 1·134, refers to note 1·134a, which will be found on p. 692.

Sources of examples are indicated by the following abbreviations:

M. T.              Mathematical Tripos, Part II and Schedule A.
M. T., Sched. B.   Mathematical Tripos, Part III and Schedule B.
Prelim.            Preliminary Examination in Mathematics.
M/c, III.          Manchester, Final Honours in Mathematics.
I. C.              Imperial College, London.
Chapter 1

THE REAL VARIABLE

'In dem days dey wuz monstus fon’ er minners.'

JOEL CHANDLER HARRIS, Uncle Remus

1·01. The relation of mathematics to physics. The simplest mathematical
notion is that of the number of a class. This is the property common to the class and to any 
class that can be matched with it by pairing off the members, one from each class, so that 
all members of each class are paired off and none left over. In terms of the definition we 
can give meanings to the fundamental operations of addition and multiplication. Consider
two classes with numbers a, b and no common member. The sum of a and b is the
number of the class consisting of all members of the two classes taken together. The 
product of a and b is the number of all possible pairs taken one from each class. We cannot 
always give meanings to subtraction and division, because, for instance, we cannot find 
a class whose number is 2 - 3 or 7/5. But it is found to be a great convenience to extend the 
notion of number so as to include negative numbers, ratios of numbers irrespective of 
whether they are positive or negative, and even irrational numbers. When this is done 
we can define all the four fundamental operations of arithmetic, and the result of carrying 
them out will always be a number within the system. We need trouble no more about 
whether an operation is possible with a particular set of numbers, since we know that it is, 
once we have given sufficient generality to what we mean by a number. So long as we 
keep to the fundamental operations we can use algebra; that is, we can prove formulae 
that will be correct when any numbers whatever are substituted for the symbols in them, 
with only one exception, namely, that we must not divide by 0. 

Now the formulae may still be correct when we replace the letters in them by something 
other than numbers, and it is to this fact that the possibility of mathematical physics is due. 
It is therefore useful to know just what conditions have to be satisfied if we are to take 
over the rules of algebra into any subject that does not deal entirely with numbers. We 
may then have to find new meanings for the fundamental operations (or have them found 
for us) and for the sign =, but can still manipulate the symbols with their new meanings 
in the old way. A suitable set of conditions is as follows.* We say that they are to hold 
in a field F consisting of all elements of the system considered: 

(1) For any a, b of F, a + b and ab are uniquely determined elements of F. 

(2) b + a = a + b. (Commutative law of addition.) 

(3) (a + b) + c = a+(b + c). (Associative law of addition.) 

(4) ba = ab. (Commutative law of multiplication.) 

(5) a(bc ) = (ab) c. (Associative law of multiplication.) 

(6) a(b + c) = ab + ac. (Distributive law.) 

(7) There are two elements 0 and 1 in F, such that a + 0 = a, a1 = a.

(8) For any element a of F there is an element x of F such that a + x = 0. 

(9) For every element a of F, other than 0, there is an element y of F such that ay = 1.

* Stated first by Dedekind for the case where + and × have their ordinary arithmetic meanings;
in general by H. Weber. 

It is to be noticed that the first seven rules are true if F consists only of the positive 
integers and 0, but the last two are false of that F, since there is no positive or zero 
integer x that makes a + x = 0 if a = 1, and there is no positive or zero integer y that 
makes ay = 1 if a = 2. The eighth rule introduces negative numbers and hence subtraction.
The ninth introduces reciprocals and hence division and rational fractions.
The rules are true if F consists of all rational numbers, positive or negative. 
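
The failure of rules (8) and (9) for the non-negative integers, and their truth for the rationals, can be checked on a small sample. What follows is a minimal sketch in Python and is not part of the original argument. The helper names has_additive_inverse and has_reciprocal and the sample bounds are illustrative assumptions, and a finite search of course proves nothing about the infinite systems.

```python
from fractions import Fraction

def has_additive_inverse(a, candidates):
    """Rule (8): is there an x among the candidates with a + x == 0?"""
    return any(a + x == 0 for x in candidates)

def has_reciprocal(a, candidates):
    """Rule (9): is there a y among the candidates with a * y == 1?"""
    return any(a * y == 1 for y in candidates)

# A finite sample of the positive integers and 0.
naturals = list(range(0, 21))

# A finite sample of the rationals p/q with |p| <= 20 and 1 <= q <= 20.
rationals = {Fraction(p, q) for p in range(-20, 21) for q in range(1, 21)}

# The two counter-examples quoted in the text: a = 1 for (8), a = 2 for (9).
rule8_fails_for_1 = not has_additive_inverse(1, naturals)
rule9_fails_for_2 = not has_reciprocal(2, naturals)

# The same elements do have an inverse and a reciprocal among the rationals.
rule8_holds_in_rationals = has_additive_inverse(Fraction(1), rationals)
rule9_holds_in_rationals = has_reciprocal(Fraction(2), rationals)
```

Exact rational arithmetic (fractions.Fraction) is used so that the test ay = 1 is a statement of exact equality and not of equality within rounding error.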

The rules mention no ordering relation: that is, they suppose a meaning attached to 
equality and therefore to ≠, but do not distinguish between greater and less. We could
agree to arrange the numbers in any order, keeping the same correspondences between 
them according to (1), (7), (8), (9), and the rules would still be true. Algebra and pure 
geometry can get on to some extent without such a distinction, but higher mathematics 
cannot, nor can any kind of physics. A measurement is not a statement of exact equality 
but of equality within a certain range of error. We therefore need new rules concerning 
inequalities. 

(10) For any a, b of F, either a > b, a = b, or b > a. (Law of comparability.) 

(11) For given a, b of F, only one of a > b, a = b, b > a can be true. (Trichotomy.) 

(12) If a > b and b > c, then a>c. (Transitive property.) 

(13) If a > b, then a + c > b + c for any c. (Additivity of ordering.) 

(14) If a > b, c > 0, then ac > bc. (Multiplicativity of ordering.)

(15) If a > b, b < a. (Definition of <.) 

The use of mathematics in science is that of a language, in which we can state relations 
too complicated to be described, except at inordinate length, in ordinary language. The 
rules satisfied by the symbols are the grammar of the language. This point of view has 
been developed greatly in recent years, especially by R. Carnap. But for a language to 
be suitable it must satisfy two conditions. It must be possible to say in it the things that 
we need to say; that is, it must have sufficient generality. It must also be self-consistent; 
that is, starting from the rules themselves it must be impossible to deduce something 
declared to be false by those rules. It would, for instance, be fatal to the scientific usefulness
of mathematics if it was possible to prove by it that for some a and b, a is both greater
and less than b. It was always taken for granted until the later nineteenth century that
mathematics was consistent. But then an unexpected set of difficulties cropped up, and 
showed that a complete analysis of the foundations was necessary. The great Principia 
Mathematica of Whitehead and Russell showed that all the propositions asserted in 
mathematics concerning real numbers (not only ratios of integers, positive or negative) 
could be restated as propositions about the elementary notion of comparing classes by 
pairing their members, and demonstrable from the axioms of such comparison and others 
relating to pure logic. Later workers have modified some of the latter axioms, and the 
best choice of axioms is still a matter of discussion. Gödel and Carnap, more recently,
have shown that the proposition that a given system of axioms for mathematics is consistent
cannot be proved by methods using only the rules of the system. But it is found
impossible to prove certain propositions that could be proved if the system was inconsistent.
We have to come back to something like ordinary language after all when we want
to talk about mathematics! This work on the boundary between logic and what we usually 
consider the elements of mathematics has a considerable modern literature, and it is well
for physicists to know of its existence, though its detailed study is a matter for specialists. 

1·02. Physical magnitudes. Generality requires that, in any particular field, the
language shall contain symbols for the things that we need to talk about and for the 
processes that we carry out. A shepherd would be severely handicapped if he had to do 
his best with a language containing no words for sheep and shearing; in fact he would 
make such words, and that is what we habitually do in science. So long as the language 
is consistent it is none the worse for containing a lot of words that we do not use. A pure 
mathematician, working entirely on the theory of numbers, can use ordinary algebra 
freely in spite of the fact that he may not need to use negative numbers or fractions. For 
him rules (8) and (9) are just an unnecessary generality. Now in physics the fundamental 
notion of measurement corresponds closely to that of addition, and most physical laws 
are statements of proportionality, which corresponds to the notions of multiplication and 
division. This is the ultimate reason why mathematics is useful. Thus, for instance, we 
can say that if two bars are placed end to end to make one straight bar, the length of the 
combined bar is the sum of those of the original ones. This is not a theorem or an experimental
fact; it is the definition of addition for lengths. Further, it is irrelevant which is
taken first; thus the commutative law of addition holds. Again, if we unite three bars, the 
total length is independent of the order; hence the associative law of addition also holds. 
These are experimental facts established by actual comparison with other bars. These 
rules are enough to justify the use of scales of measurement for length, by which any 
length is compared with a standard one by means of a scale, every interval of which has 
been compared with a standard object in the process of manufacture. Quantities measurable
by some process of physical addition have been called fundamental magnitudes by
N. R. Campbell.* The most widely important ones are numbers (of classes), length, time, 
and mass, but physical processes of addition can also be stated for area and volume, for 
electric charge, potential, and current, and many other quantities. 

There is a divergence of practice among physicists at the next stage. A statement that 
a distance is 3·7 cm. contains a number and a unit. It is often thought that algebra applies
only to numbers and therefore that in the mathematical treatment the symbol used for
the distance refers only to the 3·7 and not to the centimetres. The unit matters, otherwise
we should find ourselves saying that 10 mm. expresses a different length from 1 cm. and
that 1 cm. is the same as 1 mile; and this is contrary to physics because the only justification
of using measurement at all is in the direct physical comparison by superposition.
We avoid this difficulty if we say that the symbol for the length refers to the length itself
and not simply to the number contained in its measure. ‘1 inch = 2·54 cm.’ is a useful
statement; either symbol, ‘1 inch’ or ‘2·54 cm.’, denotes the same length. In general
theorems this procedure can always be followed. When a particular application to a 
measured system is made we naturally give the symbols their actual values in terms of 
the measures, which will include a statement of the units; but in the general theory the 
unit is irrelevant. The symbols will then be said to stand, not for numbers, but for physical 
magnitudes. 

* ‘Elementary’ might be better.

The alternative method would be to let the symbols stand for the numbers, but then
confusion can occur, and does, between the relations between measures of the same system
in different units, which are different ways of saying the same thing, and of different
systems in the same units, which say different things. If, however, the numerical values
in terms of special units are used for a and b in ab, their product will be the number in the
expression of ab in what is usually called the consistent unit for ab. The word germane,
introduced by E. A. Guggenheim, is better because it is not inconsistent to measure distances 
upward in feet, horizontally in yards, and downward in fathoms; it is merely a nuisance. 
With adequate care this method can be used correctly, but it has several disadvantages; 
in particular it then leads to placing too much emphasis on the units and too little on the 
fundamental physical comparisons without which the units would be useless. It also 
suggests many comparisons that are physically meaningless, as we shall see in a moment. 

If we use the notion of magnitude and retain the processes of algebra the question will 
at once arise, what do we mean by a = b and a + b if a is a length and b a time or a mass? 
A meaning could be attached to a + b, though it would be very artificial, but no physical 
process will give one to a = b. But a/b would have a meaning, being respectively a velocity
or a length per unit mass. 

The group of rules (10)-(14) therefore needs modification. Those up to (9) could stand, 
though they bring in many additions and subtractions and possibly some multiplications 
and divisions that we shall never have occasion to use; but in addition to the three possibilities
enumerated in (10) we must admit a fourth, that a and b may not be comparable
and therefore belong to different fields, and their product and ratio may belong to other
fields again. This is a further disadvantage of the use of symbols to denote only the
number stated in a measure, since all numbers are comparable, and the language would not
exhibit the fact that it is meaningless to say that a time is greater than a density. We can
then say also that if a and b are not comparable, a + b is not a physical magnitude and
addition does not arise. The whole field of physical magnitudes is thus divided into plots. 
Magnitudes in the same plot will be comparable, but their product will belong to a 
different plot unless at least one of them is a number. 
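
The division of magnitudes into plots can be imitated in a program. The following Python sketch is an illustration under stated assumptions, not a method given in the text: the class name Magnitude, its method names, and the choice of a triple of exponents (length, time, mass) to label each plot are all hypothetical, and values are held as exact fractions of the centimetre, second and gramme. Labelling a plot by exponents alone is a simplification of what the text describes.

```python
from fractions import Fraction

class Magnitude:
    """A number together with the plot (exponents of length, time, mass) it belongs to."""

    def __init__(self, value, length=0, time=0, mass=0):
        self.value = Fraction(value)
        self.dim = (length, time, mass)

    @staticmethod
    def _lift(x):
        # A bare number is treated as a magnitude in the plot of pure numbers.
        return x if isinstance(x, Magnitude) else Magnitude(x)

    def _require_same_plot(self, other):
        if self.dim != other.dim:
            raise TypeError(f"not comparable: {self.dim} and {other.dim}")

    def __add__(self, other):
        other = self._lift(other)
        self._require_same_plot(other)   # a + b exists only within one plot
        return Magnitude(self.value + other.value, *self.dim)

    def __gt__(self, other):
        other = self._lift(other)
        self._require_same_plot(other)   # rule (10) holds only within one plot
        return self.value > other.value

    def __mul__(self, other):
        other = self._lift(other)
        dim = tuple(p + q for p, q in zip(self.dim, other.dim))
        return Magnitude(self.value * other.value, *dim)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = self._lift(other)
        dim = tuple(p - q for p, q in zip(self.dim, other.dim))
        return Magnitude(self.value / other.value, *dim)

cm = Magnitude(1, length=1)
mm = Magnitude(Fraction(1, 10), length=1)
inch = Magnitude("2.54", length=1)
second = Magnitude(1, time=1)

ten_mm = 10 * mm                        # the same length as 1 cm
velocity = (5 * cm) / (2 * second)      # a ratio lying in a different plot

try:
    cm + second                         # a length added to a time
    addition_refused = False
except TypeError:
    addition_refused = True
```

Here multiplication and division are always allowed and move the result to another plot, while addition and comparison are refused between plots, which corresponds to admitting the fourth possibility in addition to the three of rule (10).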

The language needed for physics is therefore not quite the same as ordinary algebra. 
Since the latter is self-consistent and the statement that some magnitudes are not comparable
cuts out some propositions from it and adds no new ones, the language of magnitude
is also self-consistent. It will be seen that the modification corresponds to the notion
of dimensions. Quantities of different dimensions are not comparable; also some quantities
of the same dimensions are not. For instance, according to one pair of definitions in use,
electric charge and magnetic pole strength have the same dimensions, and they are both
fundamental magnitudes, but it is meaningless to add them. The field of physical magnitudes
can be taken to satisfy the laws of algebra, but is classified; comparable quantities
satisfy (10), and are capable of addition at least in calculation; incomparable ones do not.
It should be noticed that failure of addition by a physical process is not confined to incomparable
magnitudes. For instance, there is no process of combining two substances
of density 1 g./cm.³ to give one of density 2 g./cm.³. Density is not measured directly but
calculated from the additive magnitudes mass and length, and is called a derived magnitude.
Some quantities can be both additive and derived; thus electric current measured
by its magnetic effect is a fundamental magnitude, but regarded as the charge passing
per unit time it is derived. Many derived magnitudes are ratios of two magnitudes of the
same dimensions; thus we could regard the shape of a triangle as specified by two ratios,
those of two sides to the third. These ratios are pure numbers and the rules of algebra 
can be applied to them without change.* 

* A similar treatment was advocated by W. Stroud; for discussion and applications to teaching, 
cf. Sir J. B. Henderson, Engineering, 116, 1923, 409-10. 
1·03. Real numbers. Most of the present chapter will be already familiar to those
who have studied a good modern book on calculus, and it is not intended to compete with 
standard works on pure mathematics. We think, however, that some discussion here is not 
out of place, for several reasons. First, the latter works for the most part do not emphasize 
why the refined arguments that they give have any relevance to physics, and physicists 
therefore tend to believe that they are irrelevant. Secondly, they are liable to be so long 
that a physicist can hardly be blamed if he decides that he has not the time to work 
through them. Thirdly, the attention to very peculiar functions has led the subject to 
be regarded as the pathology of functions. The reply is that every function, except an 
absolute constant, is peculiar somewhere, and that by studying where a function is 
peculiar we can arrive at constructive results about it that would be very hard to obtain 
otherwise. But we are entitled to regard ourselves as general practitioners and to restrict 
ourselves to the kinds of peculiarities that occur in physics; rare diseases may be handed 
over for treatment to a specialist, in this case a professional pure mathematician. 

The nature of the problem was foreshadowed in a theorem of Euclid that the ratio of 
the hypotenuse to one side of an isosceles right-angled triangle is not equal to any 
rational fraction. Euclid, it must be remembered, made no use of what we should now 
call numerical measures of physical magnitudes. When he said that two lines were equal 
he meant that one could be placed on the other so that the two ends of one coincided with 
the two ends of the other; this is the direct physical comparison and does not require any 
numerical description of the lengths. When he said that the square on the hypotenuse 
was twice that on a side he meant that it could be cut into pieces and that the pieces could 
then be put together so as to make the square on the side twice over. He was working 
throughout with the quantities themselves, not with the numbers that we choose to 
associate with them in measurement with regard to any special unit. The use of numbers 
for this purpose is a choice of a language. What Euclid’s theorem showed was that the 
language of rational numbers was incapable of describing simultaneously the lengths of 
the side and the hypotenuse of a triangle that could easily be drawn by the rules of his 
geometry. 

Measurement in terms of a unit is too useful a procedure to be lightly abandoned, and 
it could be retained, consistently with Euclid’s theorem, in any of the following ways: 
(1) Since an infinite number of pairs of integers x, y can be found such that x² + y² = z²,
where z is another integer, and so that x/y is as near 1 as we like, we could suppose that the
sides of a right-angled triangle satisfy x² + y² = z² exactly but that x = y is not true
exactly but only within the errors of measurement, and the sides are always exact multiples
of some definite length. (2) We might say that x = y can be exact but x² + y² = z² is
only approximate. (3) We can say that the language of rational numbers is not enough
for what we need to say, and that we need a fuller language in which x = y and x² + y² = z²
can be both said consistently. The last alternative is the one that has been universally
adopted by the admission to arithmetic of irrational numbers. It does not contradict
Euclid’s axioms; the first does, since he assumes that a line can have any length,
and the second contradicts one of their best-known consequences. An experimental 
proof that it is right is impossible because either (1) or (2) could be true within the 
errors of measurement even if x, y, z were restricted to be integers. But they would 
be intolerably complicated, and the adoption of either would require the existence 
of an unknown and indeterminable standard of length such that all actual lengths are
exact multiples of it, besides abandoning the simplicity of Euclid’s rules without experimental
reason. The universal practice in physics is to adopt alternative (3) and create a
language of sufficient generality. We introduce real numbers and assume that the operations
of addition, subtraction, multiplication and division can be applied to them in such
a way that the same fundamental rules as for rational numbers are satisfied, and that an 
ordering relation satisfying rules (10)-(15) can be defined. They differ from the rationals 
in possessing a certain property of completeness , which ensures, for instance, that there is 
a real number ^2 whose square is 2. It is not obvious that this can be done without incon¬ 
sistency (and it was certainly believed for 2000 years that real numbers were meaningless*), 
but the 19th century investigations of Dedekind, Cantor, and others have established 
their workability for all practical purposes. That is enough justification for our pur¬ 
poses. But the logical justification involves the consideration of infinite collections. 
It is indeed obvious that the evaluation of *]2 by root extraction or by successive 
approximation to a continued fraction, if taken to a finite number of steps, can never 
yield anything but a rational number; to give any exact meaning to <J2 in numerical terms 
requires an infinite number. Euclid’s procedure does lead in a finite number of steps to 
a ratio that can be identified with f2, but does not describe it in a numerical way, and 
the proof that his axioms are themselves consistent has so far been completed only by 
way of the numerical approach. The notion of f2 is accepted at school largely because 
we believe that a consistent system of measurement of physical objects is possible and 
Euclid’s axioms look plausible; but we forget that the Euclidean triangle is not the real 
triangle, or, if we remember, we think that the real triangle is an imperfect representation 
of the Euclidean one. Physically the Euclidean triangle is an idealized approximation 
to the real one, and we cannot take it for granted that the idealization does not introduce 
new troubles of its own. 

1.031. Nests of intervals: Dedekind section. The fundamental property of real 
numbers is that they can be approximated to as closely as we please by rational numbers. 
When we say that 

$$\sqrt{2} = 1.414\ldots,$$

we assert the following set of propositions: (1) 2 is between $1^2$ and $2^2$; (2) 2 is between 
$1.4^2$ and $1.5^2$; (3) 2 is between $1.41^2$ and $1.42^2$; (4) 2 is between $1.414^2$ and $1.415^2$; and so 
on to any desired accuracy. At each stage this process can be regarded as separating the 
decimals, to a given number of places, into two classes, those whose squares are respectively 
greater or less than 2. At stage 3, for instance, the squares of 1.414, 1.413, 1.412 
are less than 2, those of 1.415, 1.416, 1.417 greater than 2. We say nothing at this stage 
about the fractions 1.4141, 1.4142, ..., 1.4149; but at the next stage we say that 2 lies 
between the squares of 1.4142 and 1.4143. By taking a sufficient number of decimals 
we can make the unconsidered interval as small as we like, since we divide it by 10 at 
each step. Thus any decimal with a finite number of places will ultimately be classified 
according as its square is less or greater than 2. Now this process determines a unique 
infinite decimal, which we can take to be $\sqrt{2}$, and it can be regarded as the limit approached 
by the successive approximations from either side. 
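
The successive propositions can be generated mechanically. The following is a minimal sketch, assuming exact rational arithmetic through Python's `fractions.Fraction`; the function name `sqrt2_decimal_nest` is ours and is used only for illustration.

```python
from fractions import Fraction

def sqrt2_decimal_nest(stages):
    """Return [(a_k, b_k)] for k = 0..stages-1 with a_k**2 < 2 < b_k**2
    and b_k - a_k = 10**(-k), using exact rational arithmetic."""
    nest = []
    a = Fraction(1)  # stage 0 starts from 1**2 < 2 < 2**2
    for k in range(stages):
        step = Fraction(1, 10 ** k)
        # Move the lower end up in units of `step` while its square stays below 2.
        while (a + step) ** 2 < 2:
            a += step
        nest.append((a, a + step))
    return nest

for a, b in sqrt2_decimal_nest(5):
    print(a, b, b - a)
```

Each printed pair brackets the number whose square is 2, and the width of the bracket is divided by 10 at every stage.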

* Hence the name ‘irrational numbers’. 

This process, which is capable of great extension, is an example of the definition of a 
real number by a nest of rationals. We take a succession of rationals $\{a_n\}$ and another 
succession $\{b_n\}$, satisfying the following conditions: 

(i) $a_{n+1} \ge a_n$, 

(ii) $b_{n+1} \le b_n$, 

(iii) $a_n < b_n$, 

for all n, and 

(iv) Given any positive rational number $\epsilon$, a number N can be found such that 
$b_n - a_n < \epsilon$ for every n > N. 

Such a nest $\{a_n \mid b_n\}$ can be used as a definition of a real number. A member $a_n, b_n$ of the nest 
consists of the set of rationals greater than or equal to $a_n$ and less than or equal to $b_n$. 
The real number defined by the nest lies between the end-points of all its members. 

A nest may turn out to define a rational number. For instance, if we consider decimals 
whose squares are respectively just less and just greater than 2.25 we get the nest 1, 2; 
1.4, 1.6; 1.49, 1.51; 1.499, 1.501; .... The only decimal lying between the end-points of all 
members of the nest is 1.5, whose square is in fact 2.25. For every rational we can construct 
such a nest, so that the rationals themselves are real numbers. 

A single real number can be defined by many different nests. For instance, instead of 
dividing the interval by 10 at each stage we could divide by 2, in this way generating 
a binary fraction or ‘decimal to base 2’. It would take more than three times as many 
steps to get as good an approximation, but the process defines the same real number as 
before. Two nests $\{a_n \mid b_n\}$ and $\{\alpha_n \mid \beta_n\}$ define the same real number if and only if $a_n, b_n$ 
contains $\alpha_m, \beta_m$ for sufficiently large m, and $\alpha_n, \beta_n$ contains $a_m, b_m$ for sufficiently large m; in fact 
only one of these conditions need be known to hold; the other follows as a consequence. 

We now come to the most important property of the real number system. We abandon 
the condition that $a_n, b_n$ shall be rational and consider a nest $\{a_n \mid b_n\}$ where $a_n$ and $b_n$ are 
now real numbers. An interval of such a nest consists of the set of real numbers greater 
than or equal to $a_n$ and less than or equal to $b_n$. In condition (iv) $\epsilon$ is now any positive real 
number. It can be proved that there is one and only one real number lying in every 
interval of the nest. In other words, if we apply to the real numbers the process that we 
have applied to the rationals, we get nothing new, but remain within the system that we 
have already defined. This is the property of completeness mentioned in 1.03. 

Another important way of defining real numbers is by a Dedekind section or cut. If the 
rational numbers are divided into two classes L and R such that every member of L is 
less than every member of R, there is only one real number greater than or equal to every 
member of L and at the same time less than or equal to every member of R. If this real 
number is rational, then it will be either the greatest member of L or the smallest member 
of R. For instance, L might consist of the negative rationals together with 0 and the 
positive rationals whose squares are less than 2, and R of the positive rationals whose 
squares are greater than 2. This cut defines the real number $\sqrt{2}$. 

Dedekind section arises most naturally when the numbers are classified according 
as they possess or do not possess a certain property. For instance, ‘x has a square 
not greater than 2.25’ defines an L class, the largest member of which is 1.5; ‘x has 
a square less than 2.25’ defines an L class with no largest member, and 1.5 is the 
smallest member of the R class. ‘x is rational and has a square less than 2’ defines 
L and R classes of rationals with no largest and no smallest member respectively. 
‘x is real and has a square less than 2’ defines an L class with no largest member 
and an R class with smallest member $\sqrt{2}$. 

In terms of the Dedekind section, the completeness property of the real number system 
is equivalent to the statement that any cut in the real numbers defines a real number. 
Thus many problems that have no answer in the rational number system can be solved 
in terms of real numbers. We have so far considered only $\sqrt{2}$, but we are also ready for 
$\pi$ and e when they turn up, and shall not need to search for a statement of each problem 
in such a form that it can be solved in rational numbers. The use of the real number 
system therefore avoids a lot of complications with no relevance to physics. 

The methods of nested intervals and of Dedekind section are equivalent. If L and R 
classes exist we can form a nest of intervals, taking $a_1, a_2, \ldots$ from L and $b_1, b_2, \ldots$ from R, 
in such a way that the conditions required for a nest of intervals are satisfied. Conversely, 
if a nest exists, some rationals r will be exceeded by $a_m$ for some m, others will not be 
exceeded by any $a_m$. These inequalities define an L and an R class and the conditions for 
a cut are satisfied. 
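
One way of passing from a cut to a nest is repeated bisection. The sketch below is only an illustration under our own assumptions: the cut for the square root of 2 is represented by a membership test for its L class, and the names `in_L` and `nest_from_cut` are hypothetical.

```python
from fractions import Fraction

def in_L(r):
    """L class of the cut described above: negative rationals, 0,
    and positive rationals whose squares are less than 2."""
    return r <= 0 or r * r < 2

def nest_from_cut(in_lower, a, b, steps):
    """Bisect [a, b], keeping a in the L class and b in the R class."""
    assert in_lower(a) and not in_lower(b)
    nest = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if in_lower(mid):
            a = mid   # mid belongs to L: raise the lower end
        else:
            b = mid   # mid belongs to R: lower the upper end
        nest.append((a, b))
    return nest

nest = nest_from_cut(in_L, Fraction(1), Fraction(2), 20)
a, b = nest[-1]
print(float(a), float(b), b - a)
```

The lower ends never decrease, the upper ends never increase, and the width is halved at each step, so the conditions for a nest hold.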

If the nest $(a_m, b_m)$ defines a positive real number x, $(1/b_m, 1/a_m)$ will define 1/x. Then if 
nests $(a_m, b_m)$, $(a'_m, b'_m)$ define x, x', $(a_m a'_m, b_m b'_m)$ will define xx'. $(-b_m, -a_m)$ will define $-x$, 
and whether x, x' are positive or negative, if $(a_m, b_m)$ defines x and $(a'_m, b'_m)$ defines x', then 
$(a_m + a'_m, b_m + b'_m)$ will define x + x'. Thus all the operations of addition, subtraction, 
multiplication and division are defined for the real numbers and can be shown to satisfy 
the fundamental rules. Full details are given by Knopp.* 
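
These rules can be tried on finite portions of two nests. The following sketch assumes nests stored as lists of rational end-point pairs and obtained by bisection; the helper names are ours, and the product rule is used only for positive numbers, as in the text.

```python
from fractions import Fraction

def sqrt_nest(c, steps):
    """Bisection nest (a_m, b_m) with a_m**2 < c < b_m**2, for an integer
    c > 1 that is not a perfect square."""
    a, b = Fraction(1), Fraction(c)
    nest = []
    for _ in range(steps):
        mid = (a + b) / 2
        if mid * mid < c:
            a = mid
        else:
            b = mid
        nest.append((a, b))
    return nest

def add(n1, n2):
    return [(a + a2, b + b2) for (a, b), (a2, b2) in zip(n1, n2)]

def mul(n1, n2):
    # Valid for nests of positive numbers only.
    return [(a * a2, b * b2) for (a, b), (a2, b2) in zip(n1, n2)]

def reciprocal(n):
    # Valid for nests of positive numbers only.
    return [(1 / b, 1 / a) for a, b in n]

def negate(n):
    return [(-b, -a) for a, b in n]

x = sqrt_nest(2, 30)
y = sqrt_nest(3, 30)
lo, hi = mul(x, y)[-1]
print(float(lo), float(hi))   # brackets the square root of 6
```

The product nest brackets the number whose square is 6, and the width of the sum nest is the sum of the two widths.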

Neither method proves the existence of irrational numbers, but both show that they can 
be used consistently and that any proposition proved by using them can be interpreted as 
a true proposition about rational numbers (usually, of course, much more complicated to 
state). In Principia Mathematica the aim is somewhat more ambitious: a real number is 
interpreted as a class of rationals (essentially the Dedekind L class) and meanings are 
given to the laws of algebra in terms of certain operations on these classes; and the laws 
so stated are proved to be true. In this sense there is an actual proof of the existence of 
irrationals satisfying the laws of algebra. 

* Theory and Application of Infinite Series. 

1.032. $\epsilon$; indirect proofs. A peculiarity of the basic theorems about real numbers is 
that many of them seem incapable of direct proof. They are proved by the process known 
as reductio ad absurdum. We have to state the contradictory of the theorem and show 
that this itself leads to a contradiction; and then we argue that the theorem cannot be 
false and therefore must be true. But since most of the theorems have conclusions of the 
form x = y, their contradictories are inequalities of the form ‘x < y or x > y’. Most beginners 
find it much more difficult to handle inequalities correctly than equalities, and 
of all the difficulties found in mathematical physics the greatest found by many students 
is in learning to approximate. That is why lower marks are obtained in problems of small 
oscillations in dynamics and of potentials of nearly spherical bodies than in any other part 
of the Mathematical Tripos. Nature does not consist entirely, or even largely, of problems 
designed by a Grand Examiner to come out neatly in finite terms, and whatever subject 
we tackle the first need is to overcome timidity about approximating. A difference 
between the theory of the real variable and dynamics is that in the former we are willing 
to consider arbitrarily close approximations carried to any number of stages, whereas in 
the latter we only want an approximation close enough for the practical end in view. 
But experience in the one will tend to produce confidence in the other. 

The simplest type of argument of this form is: if $x \ge 0$, and $x < \epsilon$, where $\epsilon$ is positive but 
can be chosen as small as we like, then x = 0. For no value of x greater than 0 can be less 
than every positive $\epsilon$. An immediate extension is obtained by considering the modulus 
or absolute value of x, denoted by $|x|$ and read ‘mod x’. This is equal to x when x is positive 
or zero, and to $-x$ if x is negative. It is therefore always $\ge 0$. Then if $|x| < \epsilon$ for all positive 
$\epsilon$, $|x| = 0$ and therefore x = 0. Note that $|x| + |y| \ge |x + y|$, $|x - y| \ge |x| - |y|$. 

It is necessary for this argument to use a symbol for the small quantity. If we said 
‘$\epsilon = 0.001$’, and proved that $|x| < 0.001$ by calculation, an objector might say ‘you have 
not proved that x = 0; it might be 0.0001’. The symbol $\epsilon$, to denote an arbitrarily small 
quantity, prepares us for such an objection, since by proving that $|x|$ is less than any $\epsilon$ 
we are ready to disprove any value of x, other than 0, that an objector might suggest. 

The essential point is that we are concerned with processes that in the most general case 
could be completed only in an infinite number of steps, e.g. showing that two nests of 
intervals determine the same real number. We overcome this and obtain a finite proof by 
saying that if $a \ne b$, $|a - b|$ has a definite value M, which is not zero. If, then, we can show 
that $M < \epsilon$ for every positive $\epsilon$, it follows that M = 0, contradicting the hypothesis, so 
that a and b must be equal. 

1.033. Sets. A limit-point of a set of numbers is a number x such that for any $\epsilon > 0$ 
there is a member of the set, y, different from x, such that $|y - x| < \epsilon$. It follows that there 
are infinitely many values of y satisfying this condition. For by definition there is one; 
call this $y_1$ and take a new $\epsilon$, say $\epsilon_1$, less than $|y_1 - x|$. Then there must be another y of 
the set, say $y_2$, such that $0 < |y_2 - x| < \epsilon_1$. The process can evidently be continued 
indefinitely.* 

Clearly no finite set can have a limit-point. But an infinite set also may have none; 
consider the set of all integers. No member has another within distance 1 of it, and no 
number not an integer can have more than one within distance $\tfrac{1}{2}$. In the set of rational 
numbers every member is a limit-point since there is a rational number as near as we 
like to any other. The same applies to the real numbers. A set may have only one limit-point; 
consider for instance the numbers $n^{-1}$, where n can be any integer. There are 
infinitely many within any finite distance from 0, which is therefore a limit-point; but 
around any other number, rational or not, we can take an interval that contains no 
member of the set, other than the number itself if it is a member. A limit-point of a set is 
not necessarily itself a member of the set. We can, for instance, make a set of rational 
numbers whose limit-point is $\sqrt{2}$ by taking the successive approximations to $\sqrt{2}$ by 
decimals, but $\sqrt{2}$ itself is not a rational number. 

If all the limit-points of a set are themselves members of the set, the set is said to be 
closed. An interval $a \le x \le b$ as defined in 1.031 is a closed set and is called a closed interval. 
The corresponding open interval is $a < x < b$. We shall return to this distinction in 1.061. 

* It should be noticed that expressions such as ‘the process can be continued indefinitely’ and 
‘and so on’ cover applications of mathematical induction. We shall seldom state such arguments in 
full, for reasons of space. The student should, however, complete some of them for himself for practice. 

1.034. If a set has infinitely many members within a finite range $a \le x \le b$, then it has at 
least one limit-point x such that $a \le x \le b$. For if we bisect the range, one half at least must 
contain an infinite number of points of the set; bisect that half. One half again contains 
an infinite number, and we see that by repeating the process we can find an interval as 
small as we like containing an infinite number of points of the set. But this corresponds 
to the method of specifying a real number by a nest of intervals and therefore identifies 
a real number such that any small interval about it contains an infinite number of points 
of the set. It is therefore a limit-point of the set. This is known as the Bolzano-Weierstrass 
theorem. 
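
The bisection argument can be followed step by step once we can decide which half holds infinitely many points. A program cannot count an infinite set, so the sketch below assumes a hypothetical oracle `has_infinitely_many`, written by hand for the particular set 1, 1/2, 1/3, ... (positive n only); both function names are ours.

```python
from fractions import Fraction

def has_infinitely_many(lo, hi):
    """Oracle for the set {1/n : n = 1, 2, 3, ...}: the closed interval
    [lo, hi] holds infinitely many members exactly when lo <= 0 < hi."""
    return lo <= 0 < hi

def bisect_to_limit_point(lo, hi, steps):
    """Repeatedly keep a half that still holds infinitely many members."""
    assert has_infinitely_many(lo, hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if has_infinitely_many(lo, mid):
            hi = mid
        else:
            # One half at least must hold infinitely many members.
            assert has_infinitely_many(mid, hi)
            lo = mid
    return lo, hi

print(bisect_to_limit_point(Fraction(-1), Fraction(1), 10))
```

Starting from the range from -1 to 1, the intervals close down on 0, the only limit-point of this set.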

1.035. An infinite set is enumerable if its members can be paired with the positive 
integers in such a way that to each member corresponds one and only one positive integer, 
and vice versa. Thus the squares $1^2, 2^2, \ldots, n^2, \ldots$ form an enumerable set, since to each n 
corresponds one $n^2$ and to each $n^2$ one n. The rational fractions between 0 and 1 form 
another, for they can be arranged $\tfrac12, \tfrac13, \tfrac23, \tfrac14, \tfrac34, \tfrac15, \tfrac25, \ldots$, and the one that occurs in the 
nth place can be paired with n. The whole of the positive rationals form another, since 
they can be arranged $\tfrac11, \tfrac12, \tfrac21, \tfrac13, \tfrac31, \tfrac14, \tfrac23, \tfrac32, \ldots$. Here the numbers are arranged in groups, 
the sum of the numerator and denominator being the same for all in each group and greater 
by 1 than in the previous group, while those in each group are arranged in order of 
increasing numerator. In these two cases the comparison with the positive integers 
requires complete rearrangement from the natural order. 
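
The grouping by the sum of numerator and denominator is easily turned into a rule that produces the nth positive rational. This is a sketch under the assumption that fractions not in lowest terms (such as 2/2) are skipped, so that each rational is listed once; the generator name is ours.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def positive_rationals():
    """Yield the positive rationals in groups of constant
    numerator + denominator, by increasing numerator within a group."""
    total = 2
    while True:
        for num in range(1, total):
            den = total - num
            if gcd(num, den) == 1:   # skip repeats such as 2/2 or 2/4
                yield Fraction(num, den)
        total += 1

first = list(islice(positive_rationals(), 8))
print([str(q) for q in first])
```

The position of a fraction in this list is the positive integer with which it is paired.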

Not all infinite sets are enumerable. Far the most important exceptions are the set of 
all real numbers and the set of all real numbers within a given finite interval. Cantor 
proved that however we may try to put them into a one-one correspondence with the 
positive integers there will always be some omitted. 

1.036. Necessary: sufficient. If two statements denoted by I and II are so related 
that if I is true, then II is true, we say that I is a sufficient condition for II and II is a 
necessary condition for I; that is, I cannot be true unless II is true. If II is true if and 
only if I is true, then I is a necessary and sufficient condition for II, and vice versa. In 
this case we may also say that I and II are equivalent. 

In general if a necessary and sufficient condition can be stated for the truth of a 
given proposition several can. For instance, a necessary and sufficient condition that x, 
a real quantity, shall be 0 is $|x| < \epsilon$ for any assignable positive $\epsilon$; but others are $x^2 = 0$ 
and $x^3 = 0$. A necessary and sufficient condition that $ax^2 - 2bx + c > 0$ for all x is that 
$a > 0$, $ac - b^2 > 0$; but another is that $c > 0$, $ac - b^2 > 0$. 

A necessary and sufficient condition may contain superfluous information. For 
instance, if $ax^2 - 2bx + c > 0$ for all x, we must have $a > 0$, $c > 0$, $ac - b^2 > 0$, and conversely. 
Hence $a > 0$, $c > 0$, $ac > b^2$ is a necessary and sufficient condition. But if $ac > b^2$, 
either $a > 0$ or $c > 0$ implies the other and one of them is superfluous in the sense that it 
follows from the other information given. On the other hand either $a > 0$, $c > 0$, or $ac > b^2$ 
by itself would not guarantee that $ax^2 - 2bx + c > 0$ for all x: none of these conditions 
alone is sufficient. A set of necessary and sufficient conditions for the truth of a 
proposition is called minimal if the conditions left when any part of them is removed 
are not sufficient. 
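
The remark that either of a > 0, c > 0 implies the other when ac > b^2 can be checked by brute force over a small grid of integer coefficients. The sketch below is only such a check, with function names of our own choosing; it is not a proof.

```python
from itertools import product

def cond_a(a, b, c):
    return a > 0 and a * c - b * b > 0

def cond_c(a, b, c):
    return c > 0 and a * c - b * b > 0

def q(a, b, c, x):
    """Value of a*x**2 - 2*b*x + c."""
    return a * x * x - 2 * b * x + c

grid = range(-6, 7)
equivalent = all(cond_a(a, b, c) == cond_c(a, b, c)
                 for a, b, c in product(grid, repeat=3))
print(equivalent)

# a > 0 and c > 0 without ac > b**2: x**2 - 4*x + 1 is negative at x = 2.
print(q(1, 2, 1, 2))
# ac > b**2 without a > 0: -x**2 - 1 is negative at x = 0.
print(q(-1, 0, -1, 0))
```

The two counter-examples show why no one of the three conditions can be dropped in favour of the other two taken singly.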

1.04. Sequences.* In considering the properties of a set we are not restricted to 
taking the members in any particular order. In the argument of 1.034, for instance, the 
points actually in any range are determined by the specification of the set, just as, if we 
put some balls into a box, what balls are in the box has nothing to do with their rearrangement 
by shaking or sorting. 

* Fuller discussions of sequences than are possible here will be found in K. Knopp’s Theory and 
Application of Infinite Series and in Hardy’s Pure Mathematics. 

When we come to study properties essentially connected with a particular order we are 
dealing with sequences. The numbers 1, 2, 3, ... in ascending order constitute a sequence; 
if they were rearranged, but in such a way that we always knew where to find a particular 
one, they would form a different sequence but the same set. If we write $s_n$ for the nth in 
a given arrangement, the property $s_{n+1} - s_n = 1$ is true for all n for the original order but 
for no other. In general if $s_n$ is completely specified when n is given, $s_n$ may be described as 
a function of the positive integral variable n, and the values $s_1, s_2, \ldots$, for successive 
values of n, form a sequence. (Those who have some knowledge of series often suppose at 
first that the terms of a sequence are to be summed, but this is not so.) Both 

$$1, \ \tfrac{1}{2}, \ \tfrac{1}{3}, \ \ldots$$ (1) 

and 

$$1, \ 2, \ 1, \ 2, \ 1, \ 2, \ \ldots$$ (2) 

are sequences. In the first the members are the members of an infinite set arranged in a 
certain order. In the second they are the members of a finite set repeated over and over 
again. 

A sequence whose general term is $s_n$ can be denoted by $\{s_n\}$. 

1.041. Bounded, unbounded, convergent, oscillatory. Let M be an arbitrary 
positive number; it is possible that whatever M we take there is at least one value of $s_n$ 
such that $|s_n| > M$. Such a sequence is called unbounded. $s_n = n$ is an obvious example, 
for we need only take n to be any integer greater than M. By an argument similar to that 
for limit-points, an unbounded sequence must have an infinite number of terms such that 
$|s_n|$ is greater than any assigned M. 

If we can choose an M such that all $|s_n|$ are less than M, the sequence is called bounded. 
Both the sequences given at the end of 1.04 are bounded; the condition holds for both if 
M = 3. 

If there is a number s such that, given any positive number $\epsilon$, we can choose m so that 
for every n > m 

$$|s_n - s| < \epsilon,$$ (1) 

the sequence is said to be convergent, and to have limit s. We then write* 

$$s_n \to s \quad (n \to \infty),$$ (2) 

or 

$$\lim_{n \to \infty} s_n = s.$$

The arrow is read ‘tends to’. We can write simply 

$$\lim s_n = s,$$ (3) 

if no ambiguity is possible. Of the above examples 1.04(1) is convergent with limit 0; 
we need only take $m > 1/\epsilon$. 1.04(2) is not, because whatever s and m we take, if $\epsilon < \tfrac{1}{2}$ 
there will be terms with n > m such that $|s_n - s| > \epsilon$. 
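
Both statements can be checked numerically. The sketch below assumes exact rational arithmetic; `index_for` and `farthest` are hypothetical helper names, and a finite window of terms is of course only a spot check of the definition.

```python
from fractions import Fraction
from math import floor

def index_for(eps):
    """Smallest integer m with m > 1/eps."""
    return floor(1 / eps) + 1

eps = Fraction(1, 100)
m = index_for(eps)
print(m)
# Every term 1/n with n > m lies within eps of the limit 0 (checked on a window).
print(all(abs(Fraction(1, n) - 0) < eps for n in range(m + 1, m + 1001)))

def farthest(s):
    """Larger of the distances from s to the two values of 1, 2, 1, 2, ..."""
    return max(abs(1 - s), abs(2 - s))

# Whatever limit s is proposed, one of the two values is at least 1/2 away.
print(min(farthest(Fraction(k, 10)) for k in range(0, 31)))
```

The last line shows that the best candidate limit, 3/2, still leaves terms at distance 1/2, so no $\epsilon$ less than 1/2 can be met.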

* J. G. Leathem, Volume and Surface Integrals used in Physics, 1905. 

The most important property of a convergent sequence is that if we have a rule for 
calculating each term, then we can calculate the limit to any accuracy we like. Some 
methods of approximation (cf. Chapters 9, 17) will prove that a quantity lies within a given 
range, but this range is not arbitrarily small; the accuracy may be enough for the 
application in view but is not capable of being improved indefinitely. 

A sequence that is bounded but not convergent is said to oscillate finitely, or simply to 
oscillate. An example is 1.04(2); another is 

$$s_n = (-1)^n + \frac{1}{n}.$$ (4) 

Unlike 1.04(2), all $s_n$ are different. The sequence is bounded, because $|s_n| < 2$ for every n; 
but it does not converge since for large n the members are alternately near to 1 and $-1$, 
and (1) cannot be satisfied if $\epsilon < \tfrac{1}{2}$. 

If for any M there is an m such that $s_n > M$ for all n > m, we write 

$$s_n \to \infty.$$

$s_n = n$ and $s_n = n^2$ are examples. 

If for any M there is an m such that $s_n < -M$ for all n > m, we write 

$$s_n \to -\infty.$$ (5) 

$s_n = -n$ and $s_n = -n^2$ are examples. 

Other types of unbounded sequences are represented by 

$$s_n = (-1)^n n, \quad s_n = n \cos \tfrac{1}{2} n\pi, \quad s_n = n(1 - \cos n\pi).$$

These cannot be said to tend to anything particular, not even infinity, and are sometimes 
called infinitely oscillating. Unbounded sequences can be called divergent; but different 
writers use this term in different senses, some (e.g. Bromwich and Hardy) excluding 
infinitely oscillating sequences and some (e.g. Knopp) including finitely oscillating ones. 
A useful device is to classify sequences according as they have or have not the properties 

(1) for any m, and any positive M, there is an n > m such that $s_n > M$, 

(2) for any m, and any positive M, there is an n > m such that $s_n < -M$. 

Sequences with neither property are bounded. If a sequence possesses (1) but not (2), 
it is bounded below, unbounded above, and similarly for the other two cases. 

Note that no definite meaning is attached to infinity as such. What we do is to give 
meanings to all the expressions that contain the word infinity or the symbol $\infty$. $s_n \to \infty$ 
is a shorthand statement of the property of $\{s_n\}$ stated in the definition of ‘$s_n \to \infty$’, and 
does not imply the existence of any real quantity denoted by $\infty$. 

Infinity is excluded from the rules of algebra, not because there is any inconsistency 
in the notion of infinite numbers, but because they follow different rules. In fact 
the notion of an infinite set is implicit in most of our theory, since there are infinitely 
many values of x in any interval of x. A consistent algebra of positive infinite numbers 
was set up by Cantor, and has been extended by many later writers. But it is different 
from ordinary algebra. If a and b are positive infinite numbers we can define a + b and ab 
uniquely; but a + b need not be greater than a; in fact it is in general equal to a or to b. 
It is not possible to define a − b and a/b uniquely. Consequently an algebra that includes 
both finite and infinite numbers must still distinguish between them in its rules. 

1.042. If an infinite set has a limit-point, s, then we can form a sequence from its members 
whose limit is s; if it has more than one limit-point we can form sequences tending to any of 
them. 
We have shown (beginning of 1.033) that there is an infinite number of members of 
the set within a given distance of a limit-point; if we take specimens in the order indicated 
we have a sequence with the property required. 

1.043. Any sequence formed of different members of a bounded infinite set with only one 
limit-point s will converge to the limit s. It is clear that in forming a sequence from the set 
we have a choice at every stage; hence the number of different sequences that can be 
formed from the set is infinite. We have to show that they all have the same limit. For 
any m, the number of terms $s_n$ of a sequence with n > m is infinite. But since the set is 
bounded and has only one limit-point, any interval not including the limit-point can 
contain only a finite number of members. Hence for any $\epsilon$ only a finite number of members 
lie outside the range $s \pm \tfrac{1}{2}\epsilon$, say $s_\alpha, s_\beta, \ldots, s_\mu$. Let m be the greatest of $\alpha, \beta, \ldots, \mu$. Then for 
all n greater than m, $|s_n - s| \le \tfrac{1}{2}\epsilon < \epsilon$, and therefore the sequence converges to s. 

The result does not follow if the members of the sequence are not required to be different 
and some can recur infinitely often. For instance, if the set is that of the reciprocals 
of the integers, its only limit-point is 0; but if repetitions are allowed we can form from 
it the sequence 



which is oscillatory. If no member recurs more than a fixed number k times, however, 
the result still follows by a simple extension of the argument. 

1.044. Upper and lower bounds. A set (or sequence) bounded above has an upper bound; 
and one bounded below has a lower bound. The upper bound of a set is a quantity M such that 
no member of the set exceeds M, but if $\epsilon$ is any positive quantity, however small, there is 
a member that exceeds $M - \epsilon$. The lower bound is a quantity m such that no member is 
less than m, but there is always one less than $m + \epsilon$. 

We use the method of Dedekind section. There are quantities a such that a is exceeded 
by some member of the set; for we might take an a less than a known member of the set. 
Since the set is bounded above, there are quantities b that are not exceeded by any member 
of the set. Every b is greater than any a, and every quantity of the same dimensions is 
either an a or a b. Hence the quantities a form an L and b an R class, and determine 
a cut, say at M. M is a member of the R class. For if it was a member of the L class it would 
be exceeded by some member of the set, say K, and there would be no quantities b between 
M and K; hence M would not be the quantity given by the cut. Hence no member of the 
set exceeds M. Also $M - \epsilon$ is in the L class and therefore is exceeded by some member of 
the set. The corresponding result for lower bounds follows similarly. 

The argument does not suppose the set infinite; but for a finite set the greatest of the set 
is the upper bound. For an infinite set all members may be less than the upper bound; for 
the set 1.04(1) the upper bound is 1 and is equal to the first term, but the lower bound is 
0 and no actual member is 0. 

What we call the upper bound is often called the least upper bound ; and any quantity such 
that no member of the set exceeds it is then called an upper bound. 

Note that if $s_n < t_n$ for all n, and $s_n \to s$, $t_n \to t$, then $s \le t$, not $s < t$. Consider $s_n = 1 - 2^{-n}$, 
$t_n = 1 - 3^{-n}$. Here s = t. We may regard $(s_n, t_n)$ as an interval whose length tends to zero, 
but these intervals do not constitute a nest because each is not part of its predecessor, 
and, in fact, the whole of each interval is on the same side of the limit. 

1.0441. If $s_n \ge s_{n-1}$ for all n, and the sequence is bounded, then the sequence converges. Let 
the upper bound of $s_n$ be s. Then for all n, $s_n \le s$. But also for any $\epsilon$ there is an m such that 
$s_m > s - \epsilon$; and then for every n > m 

$$s \ge s_n \ge s_m > s - \epsilon,$$

and therefore the sequence converges with limit s. 

1.045. The general principle of convergence. A necessary and sufficient condition 
for convergence of a sequence $\{s_n\}$ is that for any positive quantity $\epsilon$ there is an m such that for 
all $n \ge m$, 

$$|s_n - s_m| < \epsilon.$$ (1) 

We show first that the condition is necessary. Suppose that $s_n \to s$. We have to show 
that m exists such that (1) is true. For any positive $\omega$ we can take m so that $|s_n - s| < \omega$ 
for all $n \ge m$. Then $|s_n - s_m| < 2\omega$ for all $n \ge m$. Take $\omega = \tfrac{1}{2}\epsilon$; then (1) follows. 

To prove that the condition is sufficient, we notice first that the sequence is bounded, 
for, given any positive $\omega$, there is an m such that $|s_n - s_m| < \omega$ for all $n \ge m$, and $s_1, s_2, \ldots, s_m$ 
are all finite. We define $a_n$ and $b_n$ as the lower and upper bounds respectively of $s_p$ for 
$p \ge n$. Clearly 

$$a_n \le a_{n+1} \le b_{n+1} \le b_n.$$

Also since $|s_p - s_q| < 2\omega$ for p and q greater than m, we have that for $n \ge m$, $b_n - a_n \le 2\omega$ 
since $b_n - a_n$ is the upper bound of $s_p - s_q$ for $p, q \ge n$. Since $\omega$ is arbitrarily small, 
$b_n - a_n \to 0$. The intervals $(a_n, b_n)$ therefore form a nest, defining the real number s, say. 
Since 

$$a_n \le s_n \le b_n \quad \text{and} \quad a_n \le s \le b_n \quad \text{for all } n,$$

we have $|s - s_n| \le 2\omega$ for $n \ge m$, 
that is, $s_n \to s$ as $n \to \infty$. 

The device of introducing a subsidiary arbitrarily small positive quantity, usually 
denoted by $\omega$, $\delta$, or $\eta$, which is later defined as a fraction of $\epsilon$, will be met frequently in 
theorems where the quantity to be proved less than $\epsilon$ is expressible as the sum of several 
parts. 
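
The device can be seen at work on a simple example. The sketch below uses a sequence of our own choosing, $s_n = (-1)^n/n$, for which $|s_n| \le 1/n$; taking the subsidiary quantity to be half of $\epsilon$ gives an index beyond which any two terms differ by less than $\epsilon$. The function names are hypothetical.

```python
from fractions import Fraction
from math import floor

def s(n):
    """Illustrative sequence s_n = (-1)**n / n, which tends to 0."""
    return Fraction((-1) ** n, n)

def cauchy_index(eps):
    """Smallest integer m with m > 2/eps, so that 1/n < eps/2 for n >= m
    and hence |s_n - s_m| <= 1/n + 1/m < eps."""
    return floor(2 / eps) + 1

eps = Fraction(1, 50)
m = cauchy_index(eps)
print(m)
print(all(abs(s(n) - s(m)) < eps for n in range(m, m + 500)))
```

Here the quantity to be made small, $|s_n - s_m|$, is bounded by the sum of two parts, each less than $\tfrac{1}{2}\epsilon$.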

1.05. Series. If the nth term of an infinite series is $u_n$, the sums 

$$s_1 = u_1, \quad s_2 = u_1 + u_2, \quad s_3 = u_1 + u_2 + u_3, \quad \ldots, \quad s_n = u_1 + u_2 + \ldots + u_n, \quad \ldots,$$

constitute a sequence. If this sequence is convergent we say that the series 

$$\sum_{n=1}^{\infty} u_n = u_1 + \ldots + u_n + \ldots,$$

where n is now made indefinitely great, is convergent; and we call the limit of $s_n$ the sum 
of the series.* If $\{s_n\}$ is not convergent but finitely oscillating we shall speak of the 
series as finitely oscillating. 

To every theorem about sequences corresponds one about series; for if $\{s_n\}$ is a sequence, 
and we take $u_1 = s_1$, $u_n = s_n - s_{n-1}$ for n > 1, then $\sum_{r=1}^{n} u_r = s_n$. 

* As for sequences, different definitions of divergent are in use; some writers restrict the term to 
cases where $s_n \to \infty$ or $s_n \to -\infty$, others call all non-convergent series divergent. 

The geometric series is 

$$\sum_{n=0}^{\infty} x^n = 1 + x + x^2 + \ldots.$$

Here, if $x \ne 1$, 

$$s_n = \frac{1 - x^{n+1}}{1 - x},$$

and $|x^{n+1}|$ becomes indefinitely large with increasing n if $|x| > 1$. If $|x| < 1$, $x^{n+1}$ tends 
to 0.* Hence the series is convergent if $|x| < 1$, but not if $|x| > 1$. If x = 1, the sum of 
n terms is n, and the series is not convergent. If x = −1 the sum of any odd number of 
terms is 1, but that of any even number of terms is 0. The series therefore oscillates finitely. 
A necessary and sufficient condition for the series to converge is therefore $|x| < 1$. 
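
The closed form for the partial sums and the behaviour at x = −1 can be verified directly. This is a small check in exact arithmetic, with function names of our own; `partial_sum(x, n)` adds the terms from $x^0$ to $x^n$.

```python
from fractions import Fraction

def partial_sum(x, n):
    """1 + x + x**2 + ... + x**n, added term by term."""
    return sum(x ** k for k in range(n + 1))

def closed_form(x, n):
    """(1 - x**(n+1)) / (1 - x), valid when x != 1."""
    return (1 - x ** (n + 1)) / (1 - x)

for x in (Fraction(1, 2), Fraction(-2, 3), Fraction(3)):
    print(x, all(partial_sum(x, n) == closed_form(x, n) for n in range(12)))

# x = -1: the partial sums alternate between 1 and 0.
print([partial_sum(-1, n) for n in range(6)])
```

For x = 1/2 the partial sums approach 2, in agreement with the limit 1/(1 − x).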


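The Riemann $\zeta$ series is

$$\sum_{n=1}^{\infty} n^{-x} = 1 + \frac{1}{2^x} + \frac{1}{3^x} + \dots.$$

First take $x > 1$. We can take the terms in batches:

$$s_n = 1 + \left(\frac{1}{2^x} + \frac{1}{3^x}\right) + \left(\frac{1}{4^x} + \frac{1}{5^x} + \frac{1}{6^x} + \frac{1}{7^x}\right) + \dots + \left(\dots + \frac{1}{n^x}\right),$$

and the sums in brackets after the first term are respectively less than

$$\frac{2}{2^x},\ \frac{4}{4^x},\ \dots, \quad\text{that is,}\quad \frac{1}{2^{x-1}},\ \frac{1}{(2^{x-1})^2},\ \dots.$$

Then, if $m = 2^r - 1$, and $n \geq m$,

$$s_n - s_m < \frac{1}{(2^{x-1})^r} + \frac{1}{(2^{x-1})^{r+1}} + \dots = \frac{2^{-r(x-1)}}{1 - 2^{-(x-1)}},$$

which can be made $< \epsilon$ by taking $r$ large enough. Hence the $\zeta$ series converges if $x > 1$.

If $x = 1$, we write

$$s_n = 1 + \tfrac12 + \left(\tfrac13 + \tfrac14\right) + \left(\tfrac15 + \tfrac16 + \tfrac17 + \tfrac18\right) + \dots,$$

and all the sums in brackets exceed $\tfrac12$. Hence the series is not convergent; $s_n \to \infty$. All the
terms after the first are increased if $0 < x < 1$; hence again $s_n \to \infty$.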
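The two batching arguments can be illustrated numerically. This is a sketch under stated assumptions: the helper names are hypothetical, the batch for $x > 1$ is taken to run from $2^k$ to $2^{k+1} - 1$, and the batch for $x = 1$ from $2^k + 1$ to $2^{k+1}$.

```python
def batch_from_power(x, k):
    """Sum of n**(-x) for n = 2**k, ..., 2**(k+1) - 1 (batching used for x > 1)."""
    return sum(n ** (-x) for n in range(2 ** k, 2 ** (k + 1)))


def harmonic_batch(k):
    """Sum of 1/n for n = 2**k + 1, ..., 2**(k+1) (batching used for x = 1)."""
    return sum(1.0 / n for n in range(2 ** k + 1, 2 ** (k + 1) + 1))


x = 2.0
for k in range(1, 6):
    # Columns: k, batch sum for x = 2, its bound 2**(-k(x-1)), harmonic batch (> 1/2).
    print(k, batch_from_power(x, k), 2.0 ** (-k * (x - 1)), harmonic_batch(k))
```

For $x = 2$ each batch stays below the corresponding term of a convergent geometric series, while each harmonic batch stays above $\tfrac12$, so the partial sums for $x = 1$ grow without limit.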

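The related series for $\log 2$ is

$$1 - \tfrac12 + \tfrac13 - \tfrac14 + \dots.$$

Here

$$s_n - s_m = \pm\left\{\left(\frac{1}{m+1} - \frac{1}{m+2}\right) + \left(\frac{1}{m+3} - \frac{1}{m+4}\right) + \dots\right\},$$

and the sum in brackets $\{\}$ is $> 0$ whether $n - m$ is even or odd. But also

$$s_n - s_m = \pm\left\{\frac{1}{m+1} - \left(\frac{1}{m+2} - \frac{1}{m+3}\right) - \left(\frac{1}{m+4} - \frac{1}{m+5}\right) - \dots\right\},$$

and every expression in brackets $()$ is positive. Hence

$$0 < |s_n - s_m| \leq \frac{1}{m+1},$$

and this is less than $\epsilon$ for all $n > m$ if $(m + 1) > 1/\epsilon$. Hence the series is convergent.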

* Strict proofs of these apparently obvious statements will be found in Hardy’s Pure Mathematics,
pp. 134-5.

The argument can be adapted at once to show that if $u_n > 0$, $u_n > u_{n+1}$ for all $n$, and
$u_n \to 0$, then the series

$$u_1 - u_2 + u_3 - u_4 + \dots$$

is convergent.
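The bound $|s_n - s_m| \leq 1/(m+1)$ for the $\log 2$ series also bounds the distance from $s_m$ to the sum itself. A small numerical sketch (the function name `log2_partial_sum` is an illustrative choice, not from the text):

```python
import math


def log2_partial_sum(n):
    """Return s_n = 1 - 1/2 + 1/3 - ... taken to n terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))


for m in (10, 100, 1000):
    error = abs(math.log(2.0) - log2_partial_sum(m))
    # The error after m terms should not exceed the first omitted term, 1/(m + 1).
    print(m, error, 1.0 / (m + 1))
```

The error shrinks only like $1/m$, which is why conditionally convergent series of this kind are slow to compute with.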

1·051. Absolute convergence. If the series $\sum |u_n|$ converges, $\sum u_n$ converges; for the
sum of any batch of terms $u_m$ to $u_n$ cannot have a modulus greater than the sum of the
corresponding terms $|u_m|$ to $|u_n|$. In this case $\sum u_n$ is said to be absolutely convergent; if
$\sum u_n$ is convergent but $\sum |u_n|$ is not, $\sum u_n$ is said to be conditionally convergent. (The word
semiconvergent is sometimes used, but the prefix is misused, and the same word is also
used for asymptotic series, which are best not regarded as infinite series at all. This word
is therefore best avoided.)

We have seen that the series obtained from that for log 2 by taking all the signs positive 
is not convergent. Hence the series for log 2 is conditionally convergent. The geometric 
series, if convergent at all, is absolutely convergent. 

1·052. Rearrangement of series. The sum of an absolutely convergent series is unaltered
by taking the terms in any order. Let $\sum u_n$ be absolutely convergent, with sum $s$, and $\sum v_{n'}$
the same series, but with the terms differently arranged. It is understood that every term
of either series appears in the other, but not in general in the same place. Take an arbitrary
positive quantity $\omega$ and choose $m$ so that

$$\sum_{n=m+1}^{\infty} |u_n| < \omega;$$

then the sum of the moduli of any batch of terms after the $m$th formed from the first series
is less than $\omega$. Take $m'$ so that all the terms $u_n$ up to $u_m$ appear in the second series for
values of $n'$ less than $m'$. Write

$$s_m = u_1 + \dots + u_m, \qquad s'_{m'} = v_1 + \dots + v_{m'}.$$

Then $s'_{m'} - s_m$ is the sum of a set of terms of the first series after the $m$th and its modulus
is $< \omega$. Also if we take $n' > m'$, $s'_{n'} - s'_{m'}$ is the sum of another set of terms of the first series
after the $m$th and therefore its modulus also is $< \omega$. Hence the second series is convergent.
Let its sum be $s'$. Then

$$|s - s'| = |(s - s_m) - (s' - s'_{m'}) - (s'_{m'} - s_m)| < 3\omega,$$

and can therefore be proved less than any arbitrary $\epsilon$ by taking $\omega = \tfrac13\epsilon$. Hence the two
series have the same sum.

The theorem is not true of conditionally convergent series. It can be shown that if
$\sum u_n$ is conditionally convergent we can rearrange it so as to make the sum anything we like.
Such series have a precise meaning when the order of the terms is given, but not otherwise.
They usually converge too slowly to be of much use for computation, but they can be used
in theoretical work.
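The statement that a conditionally convergent series can be rearranged to give any sum can be illustrated with the series for $\log 2$. The sketch below uses the usual greedy rearrangement (take positive terms while at or below the target, negative terms while above it); this particular procedure and the name `rearranged_partial_sums` are assumptions made for illustration, not taken from the text.

```python
def rearranged_partial_sums(target, n_terms):
    """Greedily rearrange 1 - 1/2 + 1/3 - ... so the partial sums approach `target`.

    Positive terms 1, 1/3, 1/5, ... are used while the running sum is at or below
    the target; negative terms -1/2, -1/4, ... are used while it is above.
    Each term of the original series is used at most once, in its own order.
    """
    next_odd, next_even = 1, 2
    total = 0.0
    sums = []
    for _ in range(n_terms):
        if total <= target:
            total += 1.0 / next_odd
            next_odd += 2
        else:
            total -= 1.0 / next_even
            next_even += 2
        sums.append(total)
    return sums


for target in (0.0, 1.5, -1.0):
    print(target, rearranged_partial_sums(target, 100000)[-1])
```

Because the positive and negative parts each diverge while the terms tend to zero, the running sum crosses the target repeatedly with ever smaller overshoot.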

Tests for convergence based on the use of ‘comparison series’ are so closely related to 
tests for uniform convergence that we shall postpone them till we discuss the latter 
property (1*115, 1*117). 

1·053. Double series. Similar remarks apply to double series, in which the general
term is $u_{m,n}$. The condition of convergence is now that we can choose $m$, $n$ so that for all
$p$ greater than $m$ and all $q$ greater than $n$, the sums

$$\sum_{r=1}^{p}\sum_{s=1}^{q} u_{r,s}, \qquad \sum_{r=1}^{m}\sum_{s=1}^{n} u_{r,s}$$

differ by a quantity with modulus less than $\epsilon$. Absolute and conditional convergence can be
defined similarly, and it is again true that an absolutely convergent double series has the same
sum however the terms are arranged. The proofs differ only in complexity from those for
simple series.

1·06. Limits of functions: Continuity. In the most general sense, when we say
that $f(x)$ is a function of $x$ in some range of values of $x$ we mean that for every value of $x$
in the range one or more values of $f(x)$ exist. We can, for instance, speak of a function of $x$
that is equal to 1 if $x$ is rational but to 0 if $x$ is irrational. Such a function would be fairly
regarded by a physicist as pathological, and he is interested in a much narrower class of
functions, roughly speaking such as can be represented by graphs.* It will usually also
be required that the function shall be single-valued, but not necessarily. Thus for the circle

$$x^2 + y^2 = a^2,$$

we have

$$y = \pm\sqrt{a^2 - x^2},$$

and $y$ is a function of $x$; but we get its values over the whole circle only by taking both
signs for the root. A single-valued function of $x$ in a range is one that has precisely one
value for each value of $x$. We shall in the first place consider single-valued functions only.

The essential idea of a limit of a function is similar to that of the convergence of a
sequence; for the terms of a sequence $\{s_n\}$ are the values of a function of the positive
integral variable $n$, which is permitted to take arbitrarily large values. The new feature
is that for a function $f(x)$ the variable $x$ is not restricted to be integral; it may be permitted
to take any value over an interval or even any value however large.

When the values of $x$ form an interval we can define a limit of $f(\xi)$ as $\xi \to x$ as follows: if
there is a quantity $c$ such that given any positive $\epsilon$ there is a positive $\delta$ such that whenever
$0 < |\xi - x| < \delta$, then $|f(\xi) - c| < \epsilon$, we say that $c$ is the limit of $f(\xi)$ as $\xi \to x$. (We may
further restrict the admissible values of $\xi$ and, for instance, speak of the limit of $f(\xi)$ as
$\xi - x \to 0$ through positive or negative values.) If also $c = f(x)$, we say that $f(\xi)$ is
continuous at $\xi = x$. Then the definition of continuity may be stated as follows: if for any
positive $\epsilon$ we can choose a positive $\delta$ such that whenever $|h| < \delta$

$$|f(a + h) - f(a)| < \epsilon,$$

then $f(x)$ is said to be continuous at $x = a$. If this condition is satisfied and we take any
sequence $\{h_n\}$ tending to 0, then for any $\delta$ there will be an $m$ such that $|h_n| < \delta$ for all
$n \geq m$, and then $|f(a + h_n) - f(a)| < \epsilon$. Hence for all such sequences $f(a + h_n) \to f(a)$.

Most functions met in practice are continuous, with at most a finite number of points
of discontinuity. A common type of discontinuity is where $f(x + h)$, for some value of $x$,
has one definite limit as $h \to 0$ through any set of positive values, and a different one
as $h \to 0$ through any set of negative values. Such a case is called an ordinary or simple
discontinuity. For instance, if

$$f(x) = 0 \quad (x < 0), \qquad f(x) = 1 \quad (x > 0),$$

* This function is frequently used as a warning. It can be used for that purpose at once. We might
try to define a pathological function as one that is neither a continuous function nor the limit of one.
But nothing could be more ordinary than the function $\cos^{2n}(m!\,\pi x)$, which tends to this function when
$n$ first tends to infinity and then $m$ does.

the limit of $f(h)$ as $h \to 0$ through any set of positive values is 1, and that as $h \to 0$ through
any set of negative values is 0. This is a very common function in physical applications,
since it represents, for instance, a force that begins to act on a system at a definite instant
and thereafter is constant. It is usually known as the Heaviside unit function. The
postage on a letter, considered as a function of weight, has simple discontinuities. The
value at $x = 0$ usually does not need to be specified in experimental applications, because for an
object to be visible it must have some size, and therefore if $x$ is a position coordinate we cannot
observe a quantity at an exact value of $x$, but only a mean value over a range. Similarly, if $x$ is
a time we cannot observe a quantity at a single moment but only over a non-zero interval. The
usual tendency in pure mathematics is to insist that the function shall be specified for all
values of the independent variable, but in physics it is usually enough that its integral shall be
determinate. As the value of the function at a single point, provided it is finite, does not affect
the integral, it is usually irrelevant to physical applications, and if a special value is assigned
it is for the sake of convenience.

[Figure: Simple discontinuity: the Heaviside unit function.]

The notations, for $h > 0$,

$$\lim_{h \to 0} f(x + h) = f(x+), \qquad \lim_{h \to 0} f(x - h) = f(x-),$$

are often used. Then the case we have been considering is one where

$$f(0+) \neq f(0-).$$

It may happen that $f(a+) = f(a-)$ but is not equal to $f(a)$. Such a function is said to
have a removable discontinuity, but as $f(a)$ does not affect the integral such discontinuities
are not of much importance. It is, of course, impossible to illustrate by a graph.

A limit will not exist at all if the function is unbounded in the neighbourhood of a
value of $x$, as for $f(x) = 1/x$ near $x = 0$. For any sequence of values of $x$ tending to 0, $f(x)$
will be unbounded. Again, if $f(x) = \sin(1/x)$, and $x$ tends to 0 through the values $1/n\pi$,
where $n$ is an integer, the limit is 0. But if it tends to zero through the values $1/(n + \tfrac12)\pi$ it
tends to $+1$ if $n$ is restricted to be even and to $-1$ if $n$ is restricted to be odd. This kind of
misbehaviour is the most troublesome to detect when the definition of the function is at
all complicated, and also it is the kind that is most easily forgotten.
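The dependence of the limit of $\sin(1/x)$ on the sequence chosen is easy to see numerically. A minimal sketch (the variable names `x_zero` and `x_peak` are illustrative):

```python
import math


def f(x):
    """The function sin(1/x), defined for x != 0."""
    return math.sin(1.0 / x)


for n in (10, 11, 100, 101):
    x_zero = 1.0 / (n * math.pi)          # here sin(n*pi) = 0
    x_peak = 1.0 / ((n + 0.5) * math.pi)  # here sin((n + 1/2)*pi) = (-1)**n
    print(n, f(x_zero), f(x_peak))
```

Along the first sequence the values stay at 0 (up to rounding); along the second they are $+1$ for even $n$ and $-1$ for odd $n$, so no single limit exists as $x \to 0$.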

The behaviour of $f(x)$ as $x \to \infty$ is even more closely analogous to that of a sequence, since
in general $f(\infty)$ is not defined directly and we are concerned entirely with the limit itself,
if it exists. We note only the definition and the principal criterion. If there is a $c$ such that
for any $\epsilon > 0$ there is an $X$ such that for all $x \geq X$ we have $|f(x) - c| < \epsilon$, then $f(x)$ is said to
tend to $c$ as $x \to \infty$. Analogously to the general principle of convergence for series, we can
show that a necessary and sufficient condition that $f(x)$ may tend to a limit as $x \to \infty$ is
that for any positive $\epsilon$ there is an $X$ such that for all $x > X$, $|f(x) - f(X)| < \epsilon$.

1·061. Continuity in an interval. $f(x)$ is said to be continuous in an interval if it is
continuous at every point of the interval. $f(x)$ is continuous in the open interval $a < x < b$
if it is continuous for every value of $x$ such that $a < x < b$. It is continuous in the closed
interval $a \leq x \leq b$ if this condition is satisfied and also

$$f(a+) = f(a), \qquad f(b-) = f(b).$$

Note that to say that $f(x)$ is continuous at $x = c$ implies that $f(c)$ is finite; for otherwise
we could attach no meaning to $f(c + h) - f(c)$ at all. Similarly, if $f(x)$ is continuous in an
interval it is finite at all points of the interval. We shall prove that in the latter condition
it is bounded in the interval if this is closed, but not necessarily if it is open.

We denote any interval with end-points $a$, $b$ by $(a, b)$. When necessary we shall state
explicitly whether $a < x < b$, $a \leq x \leq b$, $a < x \leq b$ or $a \leq x < b$ is to be understood.*

Note that every point $x$ of an open interval is an interior point; that is, there are points
$y$, $z$ of the interval such that $a < y < x < z < b$. This is not true for a closed interval since $x$
may then be equal to $a$ or to $b$. But if $a$, $b$ are finite they are limit-points of the set in $a < x < b$.
Another way of expressing the distinction between closed and open intervals is to say that
all limit-points of sets in a closed interval are members of the interval; those of sets in an
open interval may not be, since those of some sets are the end-points. When we say that $x$
is within an interval (or in later chapters within a region) we mean that it is an interior
point; if we say that it is of a closed interval or region it may be an end or boundary point.

Functions that are continuous except at a finite number of points, where they have 
simple discontinuities, are called sectionally continuous. 

A function is continuous if it is differentiable; the converse is not true, as we see from
the example of $\sqrt{x}$ in $0 \leq x \leq 1$. This is continuous in the interval, including the end-points,
but is not differentiable at 0.† Functions have actually been constructed that are continuous
everywhere in an interval but differentiable nowhere. As a rule we shall be concerned
with functions that are differentiable except possibly at isolated points, but such
points are very numerous in crystal physics. There is a theorem of Weierstrass that any
continuous function can be represented as closely as we like by a polynomial throughout
any finite range, or by a sum of sines and cosines with suitable coefficients (cf. 14·08).
Consequently, though a continuous function is not necessarily differentiable, it can be
replaced with as much accuracy as we like by a function that is differentiable.


1·062. Covering theorems. We see that the property of continuity asserts that every
point $x$ of the interval $(a, b)$ is in an interval $(x - \delta, x + \eta)$ (where $\delta$ and $\eta$ may depend on $x$)
such that (1) $x$ is an interior point of the interval (except where $x = a$ or $b$, when it may be
an end-point), (2) the length of the interval is not zero, (3) for every point $\xi$ of the interval
a certain property holds, in this case

$$|f(\xi) - f(x)| < \epsilon.$$


* Special notations are in use for open and closed intervals, and a common practice is to denote the
open interval by $(a, b)$ and the closed interval by $[a, b]$. In previous editions of this book ( ) and [ ]
were used in the opposite senses.

† Nomenclature varies between different writers in such a case. However we choose $x_n$, positive
and tending to zero, $(\sqrt{x_n} - 0)/x_n$ ultimately exceeds any given positive value. If $f(x) = x \sin(1/x)$,
$f(x_n)/x_n$ can be made to tend to any limit between $-1$ and 1 by suitable choice of the $x_n$. In the
latter case, $f'(x)$ is said not to exist at $x = 0$. For $f(x) = \sqrt{x}$, $f'(x)$ would be said by many writers to
be infinite at $x = 0$. It is a matter of definition whether we say that $f'(x)$ does or does not exist when
$\{f(x + h) - f(x)\}/h \to \infty$; we shall usually say that it does not.

We can show that in such circumstances it is possible to choose a finite number of intervals
such that every interval satisfies such conditions and every point of $(a, b)$ belongs to at
least one of them. For different purposes we need to make the choice in somewhat different
ways, and two theorems are therefore needed to show that it is possible.

1·0621. The Heine-Borel theorem. If every point of a closed interval $(a, b)$ is within
some interval $I$ of a family $F$, then there is a finite subfamily of $F$ such that every point of $(a, b)$
is within at least one interval of the subfamily. We say that $I$ covers $(c, d)$ if every point of
$(c, d)$ is an interior point of $I$ (i.e. not an end-point).

There may be an interval $I$ belonging to $F$ that covers the whole of $(a, b)$. If so there is
nothing to prove. If not, bisect $(a, b)$. There may be a pair of intervals $I_1$, $I_2$ such that
every point of $(a, \tfrac12 a + \tfrac12 b)$ is interior to $I_1$ and every point of $(\tfrac12 a + \tfrac12 b, b)$ to $I_2$. If either half
is not included in an interval $I$, bisect that half. We say that in a finite number of steps we
shall arrive at a stage where every portion of $(a, b)$ lies within at least one interval $I$. For
if not, the successive bisection of intervals will give a sequence of intervals, each part of
the preceding one, and each half the length of the preceding one, and none of them
included in an $I$. Such a sequence forms a nest of intervals and identifies a number $x_0$
common to all its members. But by hypothesis $x_0$ is interior to an $I$, say $I_0$, and hence there
is a positive $\delta$ such that all points of $(x_0 - \delta, x_0 + \delta)$ are in $I_0$. Therefore all intervals of the
nest whose lengths are less than $\delta$ are included in $I_0$, and we have a contradiction. Hence
the process of bisection leads in a finite number of steps to a set of subdivisions such that
every division of $(a, b)$ is wholly interior to some $I$. Taking for each division an $I$ that
includes it we have the theorem.

A slight modification is often made where an end-point, say $a$, is an end-point of an
interval of the family, say $I_a$, closed at $a$; $I_a$ is still supposed of non-zero length $\delta_a$. Then
$a$ is interior to the interval $J_a\,(a - \tfrac12\delta_a,\ a + \delta_a)$, and the argument applies to the set of
intervals $J$, where $J$ is the same as $I$ except that $I_a$ is replaced by $J_a$; every point of $(a, b)$
is an interior point of at least one $J$. But then the theorem follows with the modification
that $a$ may be an end-point of $I_a$ or $b$ of $I_b$ provided that $I_a$ has $a$ as a member and $I_b$
has $b$.

The theorem gives the Bolzano-Weierstrass theorem (1·034) as a special case. If possible,
let $(a, b)$ contain no limit-point of the set. Then every point of $(a, b)$ is in an interval
$I$ containing not more than one member of the set. Hence $(a, b)$ can be covered by a finite
set of such intervals $I$ and therefore contains only a finite number of members of the set
of points considered, contrary to hypothesis.

In the argument as we have stated it the only intervals bisected at each stage are those
not already covered by an $I$. We could, however, equally well bisect all the intervals. For
if $I$ covers $(c, d)$ it covers both halves of it. Hence $(a, b)$ in the conditions stated can be
divided into a finite set of equal intervals each covered by an $I$.
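The bisection argument is constructive once membership in the family can be tested. The following sketch assumes, purely for illustration, that the family is given as a finite Python list `family` of open intervals `(lo, hi)`; the theorem itself allows any family, and the names `covering_interval` and `bisection_cover` are hypothetical.

```python
def covering_interval(c, d, family):
    """Return an interval (lo, hi) of `family` with lo < c and d < hi, or None."""
    for lo, hi in family:
        if lo < c and d < hi:
            return (lo, hi)
    return None


def bisection_cover(a, b, family, max_depth=50):
    """Bisect [a, b] repeatedly until every piece lies within one interval.

    Returns a list of (piece, covering interval) pairs ordered from a to b.
    Only pieces that are not yet covered are bisected, as in the text.
    """
    chosen = covering_interval(a, b, family)
    if chosen is not None:
        return [((a, b), chosen)]
    if max_depth == 0:
        raise ValueError("no cover found; the family may not cover [a, b]")
    mid = 0.5 * (a + b)
    return (bisection_cover(a, mid, family, max_depth - 1)
            + bisection_cover(mid, b, family, max_depth - 1))


family = [(-0.1, 0.4), (0.3, 0.7), (0.6, 1.1)]
for piece, interval in bisection_cover(0.0, 1.0, family):
    print(piece, "lies within", interval)
```

The `max_depth` guard stands in for the contradiction in the proof: if the family really covers the closed interval, the recursion stops after finitely many bisections.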

1·0622. The modified Heine-Borel theorem. In the Heine-Borel theorem the
intervals $I$ may be specified by any rule so long as each is of non-zero length and every point
of $(a, b)$ is an interior point of at least one of them (except that $a$ and $b$ may be end-points).
Sometimes, however, a further restriction is made, according to which each point $x$ of
$(a, b)$ specifies an $I_x$, of which $x$ is an interior point. Then the following theorem holds.
Suppose that every point $x$ of $a \leq x \leq b$ is within an interval $I_x\,(x - \delta_x,\ x + \eta_x)$, where $\delta_x > 0$,
$\eta_x > 0$, except that $I_a$ may be $a \leq x < a + \eta_a$ and $I_b$ may be $b - \delta_b < x \leq b$; then $(a, b)$ may be divided
into a finite set of intervals such that each interval is part of the $I_x$ defined for some point of
that interval. The proof is by successive bisection as before. Assuming the theorem false,
we establish the existence of a nest of intervals converging to some $x_0$, such that none is
part of $I_x$ for any $x$ within that interval; but all of them less than a certain length are parts
of $I_{x_0}$ and contain $x_0$, and we have a contradiction. In this case, however, it does not follow
that $(a, b)$ can be divided into equal intervals with the required property. If $(c, d)$ is covered
by $I_x$, where $x$ is in $(c, d)$, $x$ can be interior to only one half of $(c, d)$; then the other half
is not necessarily covered by an $I_y$, when $y$ is now restricted to be in that half.

An important application is to differentiable functions. Let $f(x)$ be differentiable at all
points of $a \leq x \leq b$; this says that for any $\omega$, for any $x$ in $(a, b)$, there is a positive $\delta(\omega, x)$
such that for $|h| < \delta$

$$|f(x + h) - f(x) - h f'(x)| < \omega |h|. \tag{1}$$

Then $(a, b)$ can be divided into a finite set of intervals $(x_r, x_{r+1})$ such that for all $x$ of $(x_r, x_{r+1})$

$$|f(x) - f(\xi_r) - (x - \xi_r) f'(\xi_r)| < \omega |x - \xi_r|,$$

$\xi_r$ itself being a point of $(x_r, x_{r+1})$.

For any fixed $\eta$, (1) remains true if all $\delta(\omega, x)$ are restricted to be $< \eta$. Then all $x_{r+1} - x_r$
will be $< 2\eta$.

Heine* proved that a continuous function is uniformly continuous (1·071) by what was
essentially a method of Dedekind section, capable of being used to prove the general
Heine-Borel theorem and so used in Lebesgue’s proof. The specific use of overlapping
intervals is due to Borel†, the form of the Heine-Borel theorem given here to W. H. Young.‡
The bisection method was used by Bolzano; Goursat (see 11·043) used it in an important
simplification of the conditions for Cauchy’s theorem, in which he recognized the effect of
the restriction when each section is required to contain a point $x$ with which the $I_x$ covering
that section is associated. He did not, however, give the general form of the modified
theorem or comment on the possibility of proving the main theorem by the same method.
This was first done by H. F. Baker in a note reported in title only.§


1·063. A function continuous in $a \leq x \leq b$ is bounded in $a \leq x \leq b$. Take an arbitrary
positive $\epsilon$. For every $x$ of $(a, b)$ there is an interval $I_x = (x - \delta_x, x + \eta_x)$ such that for every $\xi$
of this interval

$$|f(\xi) - f(x)| < \epsilon. \tag{1}$$

Here $\delta_a = 0$, $\eta_b = 0$, otherwise $\delta_x, \eta_x > 0$. Therefore for every $\xi_1$, $\xi_2$ of this interval

$$|f(\xi_1) - f(\xi_2)| < 2\epsilon. \tag{2}$$

Then we can divide $(a, b)$ into a finite number, say $n$, of intervals $(x_r, x_{r+1})$ such that for all
$\xi_{r1}$, $\xi_{r2}$ of each interval, including the end-points,

$$|f(\xi_{r1}) - f(\xi_{r2})| < 2\epsilon. \tag{3}$$

Hence for any $x$ of $(a, b)$

$$|f(x) - f(a)| < 2n\epsilon, \tag{4}$$

and therefore $f(x)$ is bounded (above and below) in $a \leq x \leq b$.


* J. reine angew. Math. 74, 1871, 188.

† The $I$’s necessarily overlap; for if $I_x$ contained no points interior to any other, its end-points would
not be interior to any $I$ at all.

‡ Proc. Lond. Math. Soc. (1) 35, 1903, 384-8.

§ Proc. Lond. Math. Soc. (1) 35, 1903, 459.

It follows that $f(x)$ has upper and lower bounds in $(a \leq x \leq b)$. If $f(x)$ has upper bound $M$
and lower bound $m$ in an interval (irrespective of whether $f(x)$ is continuous) we call $M - m$
the leap of $f(x)$ in the interval.*


1·064. A function continuous in a closed interval $(a, b)$ attains in that interval its upper and
lower bounds in $(a, b)$ and every value between them.

Let $m$, $M$ be the lower and upper bounds of $f(x)$ in $(a, b)$. Let $c$ be any value not taken by
$f(x)$ in $(a, b)$. Then for any $x$ of $(a, b)$ there is an interval $I_x = (x - \delta_x, x + \eta_x)$, such that for
all $\xi$ common to $(a, b)$ and $I_x$

$$|f(\xi) - f(x)| < \tfrac12 |f(x) - c|,$$

since $f(x)$ is continuous and $|f(x) - c|$ positive. Hence at points common to $(a, b)$ and $I_x$

$$|f(\xi) - c| > \tfrac12 |f(x) - c| > 0$$

and $f(\xi) - c$ has the sign of $f(x) - c$. Therefore, by the Heine-Borel theorem, $(a, b)$ can be
covered by a finite set of overlapping intervals $I_x$, say $I_{x_1}, \dots, I_{x_m}$. Therefore (1) the lower
bound of $|f(\xi) - c|$ in $(a, b)$ is $\geq \tfrac12 |f(x_r) - c|$ for some $r$; none of these is zero and therefore
$c$ is not the upper or the lower bound of $f(x)$ in $(a, b)$; (2) $f(\xi) - c$ preserves the same sign
throughout $(a, b)$ and therefore $c$ is not between $m$ and $M$.

The need for the restriction to continuous functions is made clearer by considering the
function $f(x) = x$ $(0 \leq x < \tfrac12)$, $f(x) = 0$ $(\tfrac12 \leq x \leq 1)$. This has upper bound $\tfrac12$ but $f(x)$ is never
equal to $\tfrac12$.

In 1·063 and 1·064 the interval must be closed. Take $f(x) = 1/x$ in $0 < x \leq 1$; this is
continuous at every point of the interval but is unbounded. If $f(x) = x$ for $0 \leq x < 1$, the
upper bound is 1, since for any $\eta < 1$ there is an $x < 1$ such that $f(x) > \eta$, but $f(x)$ is not
equal to 1 for any $x < 1$.

1·065. Increasing and decreasing functions. A function is called increasing in an
interval $a < x < b$ if for any $x_1$, $x_2$ such that $a < x_1 < x_2 < b$, $f(x_1) < f(x_2)$. It is called decreasing
if, when $a < x_1 < x_2 < b$, $f(x_1) > f(x_2)$. A non-decreasing function is one such that $f(x_1) \leq f(x_2)$;
similarly for a non-increasing function. Such functions may be constant for some parts
of the interval; increasing and decreasing functions are nowhere constant. Increasing
and decreasing functions are together called monotonic.†


1·066. Inverse functions. If $y = f(x)$ is continuous and monotonic in a closed interval
$(a, b)$ it takes once, and only once, every value between its upper and lower bounds. Hence
there is a single-valued inverse function $x = g(y)$, which is also monotonic. (The condition
that $f(x)$ is monotonic is necessary, for if $f(x)$ was constant in an interval, or if it was
decreasing in part of the interval and increasing in another part, it could take the same
value more than once, and $g(y)$ would not be single-valued.)

The inverse function $g(y)$ is continuous. This says that, for any $y$ and given $\epsilon$, there is
a $\delta$ such that if $|\eta - y| < \delta$ then $|g(\eta) - g(y)| < \epsilon$; that is, for any $x$ and given $\epsilon$ there is

* The name oscillation is in use. This strikes us as unfortunate because it is applied to functions
that do not oscillate; when we describe a sequence as oscillatory there is some resemblance to what a
physicist means by oscillation. Saltus is used by Hobson and leap by Newman.

† In many works what we call an increasing function is called a strictly increasing function, and
what we call a non-decreasing one is called an increasing one. Similarly what we have called a
monotonic function is often called a strictly monotonic one.

a $\delta$ such that if $|f(\xi) - f(x)| < \delta$ then $|\xi - x| < \epsilon$. To prove this we take for definiteness
$f(x)$ to be increasing. Whenever $\xi_1 \leq x - \epsilon$ and $\xi_2 \geq x + \epsilon$ we have

$$f(\xi_1) \leq f(x - \epsilon) < f(x) < f(x + \epsilon) \leq f(\xi_2).$$

Then if $|f(\xi) - f(x)|$ is less than the smaller of $|f(x) - f(x - \epsilon)|$ and $|f(x) - f(x + \epsilon)|$ it
follows that

$$x - \epsilon < \xi < x + \epsilon.$$
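Since a continuous increasing function takes each value between its bounds exactly once, its inverse can be computed by bisection. This is a sketch only; bisection is a choice made here for illustration, and the names `inverse_increasing` and `cube_plus_x` are hypothetical.

```python
def inverse_increasing(f, y, a, b, tol=1e-12):
    """Return x in [a, b] with f(x) = y, for f continuous and increasing on [a, b]."""
    if not f(a) <= y <= f(b):
        raise ValueError("y lies outside the range of f on [a, b]")
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < y:
            lo = mid   # the solution lies to the right of mid
        else:
            hi = mid   # the solution lies at or to the left of mid
    return 0.5 * (lo + hi)


def cube_plus_x(x):
    """An increasing continuous function on [0, 2]."""
    return x ** 3 + x


# cube_plus_x(1) = 2, so the inverse at y = 2 should be close to 1.
print(inverse_increasing(cube_plus_x, 2.0, 0.0, 2.0))
```

Nearby values of $y$ give nearby values of $x$, which is the continuity of $g(y)$ stated in the text.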

A many-valued function can often be regarded as a set of single-valued ones. Thus for
any $x > 0$ there are two admissible values of $\sqrt{x}$. But if we agree to take always the positive
root or always the negative one we get in either case a single-valued continuous function
of $x$. The theorems for continuous functions will then apply to either of these separately,
but having decided which to take we must not change our minds.

1·07. Uniformity of continuity. In general if we choose $\delta$ so that

$$|f(x + h) - f(x)| < \epsilon \quad\text{if}\quad |h| < \delta, \tag{1}$$

for some particular value of $x$, it will be found that for some other values of $x$ and the
same $\epsilon$ the inequality will not be satisfied for the same value of $\delta$. For instance, let

$$f(x) = x^2 \quad (0 \leq x \leq 1). \tag{2}$$

If $x = 0$, (1) will be true if $\delta = \sqrt{\epsilon}$. But if $x = 1$

$$|(1 - h)^2 - 1| = |2h - h^2|,$$

which will not be less than $\epsilon$ if $h$ is, say, $\tfrac12\sqrt{\epsilon}$ and $\epsilon$ is small enough. But if we take $\delta = \tfrac12\epsilon$,
(1) will be true for all $x$ in the range.
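A rough numerical check of this example, for $f(x) = x^2$ on $0 \leq x \leq 1$, is sketched below. It samples a grid rather than taking a true upper bound, it takes $\delta = \epsilon/2$ as the uniform choice (an assumption that can be verified directly, since $|y^2 - x^2| = |y - x|(y + x) < 2\delta$ when both points lie in the interval), and the name `worst_increment` is illustrative.

```python
def worst_increment(delta, grid=200):
    """Largest sampled |y**2 - x**2| with x, y in [0, 1] and |y - x| < delta."""
    worst = 0.0
    for i in range(grid + 1):
        x = i / grid
        for j in range(grid + 1):
            y = j / grid  # y plays the role of x + h
            if abs(y - x) < delta:
                worst = max(worst, abs(y * y - x * x))
    return worst


for eps in (0.5, 0.1, 0.01):
    # delta = eps / 2 works for every x; delta = sqrt(eps) works at x = 0 but not near x = 1.
    print(eps, worst_increment(eps / 2), worst_increment(eps ** 0.5))
```

The second column stays below $\epsilon$, while the third exceeds it for the smaller values of $\epsilon$, showing that a $\delta$ chosen at $x = 0$ does not serve the whole range.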

This brings us to the idea of uniformity, which we shall meet again and again. (1) specifies
an inequality that is satisfied for some $\delta$ for every $\epsilon$ and $x$, but $\delta$ for given $\epsilon$ may depend on
$x$, and can be written $\delta(\epsilon, x)$. A proposition (here $|f(x + h) - f(x)| < \epsilon$) is said to hold
uniformly with regard to a variable (here $x$) if a condition for its truth can be stated so as
not to depend on $x$; thus here $|h| < \delta(\epsilon)$, where $\delta(\epsilon)$ may depend on $\epsilon$ but not on $x$.


1·071. A continuous function is uniformly continuous in any closed interval. In the
argument of 1·063, which applies to any value of $\epsilon$, replace $\epsilon$ by $\omega$ and let $\delta$ be the length
of the shortest interval $(x_r, x_{r+1})$. Then any two points $\xi_1$, $\xi_2$ of $(a, b)$ such that $|\xi_2 - \xi_1| < \delta$
must belong to the same or adjacent intervals, and therefore

$$|f(\xi_2) - f(\xi_1)| < 4\omega.$$

Take $\omega = \tfrac14\epsilon$; then there is a $\delta$ such that whenever $\xi_1$, $\xi_2$ are points of $(a, b)$ satisfying

$$|\xi_2 - \xi_1| < \delta,$$

$$|f(\xi_2) - f(\xi_1)| < \epsilon.$$


1·08. Orders of magnitude. If as $x$ tends to a limit $\phi(x)$ tends to 0 or $\infty$, and $f(x)/\phi(x)$
is bounded, we say that $f(x) = O\{\phi(x)\}$, or that $f(x)$ is of the same order of magnitude
as $\phi(x)$. If $f(x)/\phi(x) \to 0$ as $\phi(x) \to 0$ we write $f(x) = o\{\phi(x)\}$. If $f(x)$ is bounded we can write
$f(x) = O(1)$. This notation must be distinguished from the common usage in physics,
where we may say that the masses of Jupiter and Saturn are of the same order of magnitude,
meaning roughly that they differ by not more than a factor 10 without there being
any question of a limit. In the physical sense the quantities compared must have the same
dimensions. This is not necessary in the mathematical sense; $x$ may, for instance, be a
time-interval and $f(x)$ the distance travelled by a sound wave. Then $f(x) = O(x)$ because
$f(x)/x$ is the velocity of sound and is supposed finite.

Note that $O(x^m)\,O(x^n) = O(x^{m+n})$, $o(x^m)\,O(x^n) = o(x^{m+n})$.

1·09. Functions of bounded variation. If the function $f(x)$ is defined in the closed
interval $(a, b)$, and there is a number $M$ such that

$$v = |f(x_1) - f(x_0)| + |f(x_2) - f(x_1)| + \dots + |f(x_n) - f(x_{n-1})| \leq M,$$

for every subdivision $a = x_0 < x_1 < x_2 < \dots < x_{n-1} < x_n = b$, $f(x)$ is said to be of bounded
variation in the interval $(a, b)$; and the upper bound of the sums $v$ for all possible selections
of the subdivisions is called the total variation of $f(x)$ in the interval.* The total variation
is of interest since it is related to the condition for existence of a Stieltjes integral (1·102),
and to the existence of the length of a curve, and it is useful in the theory of Fourier series
and Fourier integrals.

We assume repeatedly that the sum and product of two continuous functions (and
therefore of any finite number) are continuous, and that those of two functions of bounded
variation are of bounded variation. The proofs are simple: for the last, notice that

$$f(x_{r+1})\,g(x_{r+1}) - f(x_r)\,g(x_r) = \tfrac12\{f(x_r) + f(x_{r+1})\}\{g(x_{r+1}) - g(x_r)\} + \tfrac12\{g(x_r) + g(x_{r+1})\}\{f(x_{r+1}) - f(x_r)\}$$

and it follows that if $M$, $N$ are the upper bounds of $|f(x)|$, $|g(x)|$, and $U$, $V$ the total variations
of $f(x)$, $g(x)$ in the interval, the total variation of $f(x)\,g(x)$ is not greater than $MV + NU$.

Note that it is not satisfactory to define the total variation as the limit of the sum given, for
there may be no limit for some ways of making the subdivision, or different ways may give different
limits. Take for instance

$$f(x) = 0 \ (0 \leq x < \tfrac12); \qquad f(\tfrac12) = 1; \qquad f(x) = 0 \ (\tfrac12 < x \leq 1).$$

But the limit, if it exists, does give the total variation if the function is continuous or monotonic.


1·091. If a function has bounded variation it need not be continuous, or conversely. For
if $f(x) = 0$ for $x \leq 0$ and $= 1$ for $x > 0$, the variation does not exceed 1 in any interval;
but $f(x)$ is discontinuous. Conversely, if $f(x) = x \cos(1/x)$ for $x \neq 0$, and $f(0) = 0$,

$$f\left(\frac{1}{n\pi}\right) = \frac{1}{n\pi}(-1)^n.$$

The variation between $x = 1/n\pi$ and $x = 1/(n+1)\pi$ is therefore at least
$\dfrac{1}{n\pi} + \dfrac{1}{(n+1)\pi}$, and that between $x = 1/\pi$ and $x = 1/n\pi$ at least

$$\frac{1}{\pi}\left(1 + \frac{2}{2} + \frac{2}{3} + \dots + \frac{2}{n-1} + \frac{1}{n}\right),$$

which tends to infinity with $n$. Hence $f(x)$ has not bounded variation. But $f(x)$ is seen to
be continuous even at $x = 0$, since $|f(x)| \leq |x|$, $f(0) = 0$.
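The growth of the variation of $x \cos(1/x)$ can be seen by evaluating the sum $v$ on the subdivision points $1/k\pi$. A minimal sketch; the helper name `variation_sum` is illustrative, and the sum is computed for one particular subdivision only, so it gives a lower estimate of the total variation.

```python
import math


def f(x):
    """x*cos(1/x) for x != 0, with f(0) = 0."""
    return x * math.cos(1.0 / x) if x != 0.0 else 0.0


def variation_sum(func, points):
    """Sum of |func(x_r) - func(x_{r-1})| over consecutive points of a subdivision."""
    values = [func(x) for x in points]
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


for n in (10, 100, 1000):
    # Subdivision points 1/(n*pi) < ... < 1/(2*pi) < 1/pi, in increasing order.
    points = [1.0 / (k * math.pi) for k in range(n, 0, -1)]
    print(n, variation_sum(f, points))
```

The printed sums grow roughly like $(2/\pi)\log n$, in line with the harmonic-type sum in the text.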


* Some authors use fluctuation instead of variation. 
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1*092. Any function of bounded variation in (a, b) is the difference of two bounded non¬ 
decreasing functions. 

For any closed interval (a, x) we consider the sum 

P = 2{/(a r )-/(»V-i)}, 

where x 0 = < x k _ x ^ x k — x 

taken over all terms f(x r ) —f(x r _ x ) that are positive, and 

-n = Z{f(x r ) 

taken over all negative terms. The upper bound of p over all possible subdivisions is the 
positive variation P(a,x) in ( a,x ) and the upper bound of n is the negative variation 
N(a,x). Let v = p + n\ then the values of v are a bounded set since they are all < V(a, b). 
Their upper bound V(a, x) is the total variation of f(x) in ( a,x ). Also by taking upper 

bOUnds V{a, x) = P(a, x) + N(a, x). 

Evidently P(a, x), N(a, x), V(a, x) are all non-decreasing functions of x and are bounded
in (a, b).

For any subdivision and for any fixed x

\[ p - n = f(x) - f(a), \qquad p + n = v. \]

Hence

\[ p = \tfrac12 v + \tfrac12\{f(x) - f(a)\}, \qquad n = \tfrac12 v - \tfrac12\{f(x) - f(a)\}. \]

Take upper bounds over all possible subdivisions; then

\[ P(a, x) = \tfrac12 V(a, x) + \tfrac12\{f(x) - f(a)\}, \qquad N(a, x) = \tfrac12 V(a, x) - \tfrac12\{f(x) - f(a)\}. \]

Hence

\[ f(x) = \{f(a) + P(a, x)\} - N(a, x), \]

so that f(x) is expressed as the difference of two bounded non-decreasing functions.
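
The decomposition can be checked numerically. The following is only a sketch: it uses one uniform subdivision rather than the upper bound over all subdivisions, and the function `variations`, the choice of sin x and the number of points are illustrative assumptions, not part of the text.

```python
import math

def variations(f, a, b, n):
    """Sums p, n and v = p + n of the text for a uniform subdivision of (a, b)."""
    pts = [a + (b - a) * r / n for r in range(n + 1)]
    p = 0.0  # sum of the positive increments
    q = 0.0  # minus the sum of the negative increments (called n in the text)
    for left, right in zip(pts, pts[1:]):
        d = f(right) - f(left)
        if d > 0:
            p += d
        else:
            q -= d
    return p, q, p + q

f = math.sin
a, b = 0.0, 2.0 * math.pi
p, q, v = variations(f, a, b, 4000)

# p - q telescopes to f(b) - f(a), so f(b) = f(a) + p - q for every subdivision.
reconstructed = f(a) + p - q
```

For sin x over a full period the positive and negative variations are each 2, so the sums p and q should both be close to 2 once the subdivision is fine.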

1·093. All discontinuities of a function of bounded variation are simple or removable. The
characteristic feature of a simple discontinuity at x = a is that $f(a-)$ and $f(a+)$ exist
and are different. That of a removable discontinuity is that they exist and are equal, but
not equal to f(a). Suppose if possible that one of them does not exist; that is, there are
two quantities M, m (M > m) such that in any interval, however short, on one side of a
there are points where $f(x) > M$ and points where $f(x) < m$. Let $\xi_1$ be a point where $f(x) > M$.
Then there is a point between a and $\xi_1$, say $\xi_2$, where $f(x) < m$; then there is a $\xi_3$ between
a and $\xi_2$ where $f(x) > M$, and so on. It follows that the total variation in the interval $(a, \xi_1)$
is unbounded, and therefore f(x) is not of bounded variation.

Alternatively, let f(x) be of bounded variation in (a, b) and consider the positive variation
P(x) in (a, x). This is a bounded non-decreasing function of x and therefore has limits
(not necessarily equal) as $x \to c$ (in (a, b)) through larger or smaller values. The same holds
for the negative variation. Hence by subtraction f(x) has limits as $x \to c$ through larger or
smaller values, and therefore c is either a point of continuity or a simple or removable
discontinuity.

If $f(c+)$ exists we may speak of the variation in a half-closed interval (c, d) on the right
of c, meaning the variation of g(x) in (c, d), where $g(c) = f(c+)$ and otherwise $g(x) = f(x)$.
Then this variation tends to zero as $d \to c$. Similarly, we can define a variation in an interval
on the left of c, with the same property.



1·094. Leap at a discontinuity. Let f(x) be discontinuous at a but bounded in an
interval including a as an interior point. Then for some positive $\delta$, f(x) has upper and lower
bounds M, m in $(a - \delta, a + \delta)$. If $\delta' < \delta$, the upper bound in $(a - \delta', a + \delta')$ cannot be greater,
or the lower less, than in $(a - \delta, a + \delta)$. Hence the leap in $(a - \delta, a + \delta)$ has a non-negative
limit as $\delta \to 0$. If this limit is zero the function is continuous at a; if positive, we call the
limit the leap of the function at a.

If $f(x) = 0$ $(x < 0)$, $f(x) = 1$ $(x > 0)$, the leap at 0 is 1. If $f(x) = 0$ $(x \ne 0)$, $f(x) = 1$ $(x = 0)$
the leap at 0 is again 1. If $f(x) = \sin 1/x$, the leap at x = 0 is 2, since values arbitrarily
near 1 and $-1$ occur in any interval about 0.

1·10. Integration: Riemann, Stieltjes. Two different definitions of an integral will
be used in this book.

Let $x_1, x_2, \dots, x_n$ be a set of increasing values of x between a and b, subject to all
$x_{r+1} - x_r < \delta$ (we take when convenient $x_0 = a$, $x_{n+1} = b$). Take in each interval a $\xi_r$, so that
$x_r \le \xi_r \le x_{r+1}$, and form the sum

\[ S_n = f(\xi_0)(x_1 - a) + f(\xi_1)(x_2 - x_1) + \dots + f(\xi_n)(b - x_n). \tag{1} \]

This sum will depend both on the values chosen for the $x_r$ and on those for $\xi_r$, unless f(x)
is constant; but if we take a sequence of values of $\delta$ tending to zero, taking at every stage
$x_r$ and $\xi_r$ in accordance with the inequalities, and form the sum $S_n$ for each, these sums may
tend to a limit, and this limit may be independent of the choice of the $x_r$ and $\xi_r$ at each
stage. If so, this limit is called the Riemann integral and denoted by

\[ \int_a^b f(x)\,dx. \tag{2} \]

It is also possible to integrate with respect to a function. If f(x) and g(x) are both
bounded functions of x, we form the sum

\[ S_n = f(\xi_0)\{g(x_1) - g(a)\} + f(\xi_1)\{g(x_2) - g(x_1)\} + \dots + f(\xi_n)\{g(b) - g(x_n)\}, \tag{3} \]

the $x_r$ and $\xi_r$ being chosen as in (1). If this sum tends to a unique limit when the greatest interval
of x tends to zero, the limit is called a Stieltjes integral* and denoted by

\[ \int_{x=a}^{b} f(x)\,dg(x). \tag{4} \]

The method of writing the termini needs attention because g(x) may not be monotonic. 
It might return to its original value, but we must not write the range of integration as 
g(a) to g(a), which would apparently make the integral zero. It is x, not g(x), that is 
required to increase steadily throughout the range. 
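
As an illustration of the sum (3), the sketch below evaluates it on a uniform subdivision for three placings of the $\xi_r$. The names `stieltjes_sum` and `tag`, and the example $f(x) = x$, $g(x) = x^2$ on (0, 1), are assumptions made for the illustration; for this pair the integral equals $\int_0^1 2x^2\,dx = 2/3$.

```python
def stieltjes_sum(f, g, a, b, n, tag=0.5):
    """Sum (3): f(xi_r) * (g(x_{r+1}) - g(x_r)) over a uniform subdivision.

    tag in [0, 1] places xi_r inside each subinterval
    (0 = left end, 0.5 = midpoint, 1 = right end).
    """
    total = 0.0
    for r in range(n):
        left = a + (b - a) * r / n
        right = a + (b - a) * (r + 1) / n
        xi = left + tag * (right - left)
        total += f(xi) * (g(right) - g(left))
    return total

f = lambda x: x
g = lambda x: x * x
sums = [stieltjes_sum(f, g, 0.0, 1.0, 2000, tag) for tag in (0.0, 0.5, 1.0)]
```

Taking $g(x) = x$ reduces the same function to the Riemann sum (1).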

1·101. The Riemann integral $\int_a^b f(x)\,dx$ exists if and only if f(x) is bounded in (a, b) and,
for any positive values of $\omega$ and $\eta$, (a, b) can be divided into a finite set of intervals such that
those where the leap of f(x) is $\ge \omega$ have a total length $< \eta$.

First, it is clearly necessary to the existence of the integral that f(x) shall be bounded.
For if f(x) is unbounded in (a, b) there is always at least one interval $(x_r, x_{r+1})$ where it is
unbounded; and therefore the possible values of $S_n$ formed with different choices of $\xi_r$
in that interval are unbounded. Hence, however we choose the intervals, $S_n$ cannot tend
to a unique limit irrespective of the choice of $\xi_r$ at each stage.

* T. J. Stieltjes, Ann. d. Fac. d. Sciences, Toulouse, 8, 1894, J., 68-75; also D. V. Widder, The
Laplace Transform, 1941, Chapter 1; S. Pollard, Q. J. Math. 49, 1923, 73-138.

Suppose then that the upper and lower bounds of f(x) in (a, b) are M, m. Suppose also
that the upper and lower bounds in $(x_r, x_{r+1})$ are $M_r$, $m_r$, so that for any choice of $\xi_r$ we have
$m_r \le f(\xi_r) \le M_r$. Form the sums

\[ h_n = \sum m_r (x_{r+1} - x_r), \qquad H_n = \sum M_r (x_{r+1} - x_r). \tag{1} \]

These will be called the lower and upper sums for the subdivision specified by the points
$x_r$; they are the lower and upper bounds of $S_n$ for that subdivision.

Now in any interval $(x_r, x_{r+1})$ there will be a value of x where $f(x) \ge \tfrac34 M_r + \tfrac14 m_r$ and a value
where $f(x) \le \tfrac14 M_r + \tfrac34 m_r$. Hence if $M_r - m_r \ge \omega$ it will be possible to make such choices of
$\xi_r$ that the corresponding values of $f(\xi_r)(x_{r+1} - x_r)$ differ by at least $\tfrac12\omega(x_{r+1} - x_r)$. Then,
since the choices of $\xi_r$ in all intervals are made independently, if the intervals where
$M_r - m_r \ge \omega$ have total length $\ge \eta$, where $\eta$ is positive, we can make two sets of choices of
the $\xi_r$ in each interval such that the corresponding values of $S_n$ differ by at least $\tfrac12\omega\eta$.
If then there are $\omega > 0$, $\eta > 0$ such that for any subdivision of (a, b) the total length of
intervals where the leap of f(x) is $\ge \omega$ is always at least $\eta$, $S_n$ cannot have a unique limit.
Hence the condition is necessary.

Since $m_r \le M$, $M_r \ge m$, we have always

\[ h_n \le M(b - a), \qquad H_n \ge m(b - a). \tag{2} \]

Hence the values of $h_n$ given by all possible subdivisions have an upper bound, say h;
and the values of $H_n$ have a lower bound, say H. We show first that $h \le H$.

If in any interval $(x_r, x_{r+1})$ we insert a further point of subdivision, say $x_{r1}$, and again
form the lower and upper sums, the upper bound of f(x) in either part may be less than
$M_r$ but cannot exceed it. Hence insertion of new points of subdivision may decrease the
upper sum but cannot increase it; and similarly may increase the lower sum but cannot
decrease it.

Now consider two different modes of subdivision specified by points $x_r$, $x'_s$. Let the
respective sums be $H_n$, $h_n$, $H'_p$, $h'_p$. Consider the subdivision formed by taking all the points
of both subdivisions together. Let the sums for it be $H''_q$, $h''_q$. It may be regarded as a
subdivision of either the $x_r$ or the $x'_s$ set. Hence, by the last paragraph,

\[ H_n \ge H''_q \ge h''_q \ge h'_p. \tag{3} \]

Thus it is impossible for any lower sum to exceed an upper sum, and therefore for all n,
$H_n \ge h$ and therefore $H \ge h$.

Again, if we can find a method of subdivision such that $H_n - h_n < \epsilon$, it will follow that
$H - h < \epsilon$; for $H_n - h_n = (H - h) + (H_n - H) + (h - h_n)$, and $H_n - H \ge 0$, $h - h_n \ge 0$. Now
suppose that the intervals are classified into A intervals, where $M_r - m_r < \omega$, and B
intervals, where $M_r - m_r \ge \omega$. In the B intervals we still have $M_r - m_r \le M - m$. If $\alpha$ is the
total length of the B intervals, we have

\[ H_n - h_n < (b - a - \alpha)\omega + (M - m)\alpha. \tag{4} \]

Assume now that for any $\omega$ the total length of the B intervals can be taken arbitrarily
small. Then for any positive $\epsilon$ we can take $\omega$, $\alpha$ so that

\[ (b - a)\omega < \tfrac14\epsilon, \qquad (M - m)\alpha < \tfrac14\epsilon. \tag{5} \]

It follows that $0 \le H - h < \epsilon$, and therefore, since H, h are independent of $\epsilon$, that

\[ H = h. \tag{6} \]

We have still to show that if we take methods of subdivision such that the length of the
longest interval is $\delta$, and we make $\delta \to 0$, then $H_n \to H$, $h_n \to h = H$. Let $x_r$ give a set of
points of subdivision satisfying (5), so that $H_n - h_n < \epsilon$. In this subdivision let the shortest
interval be $\delta$, and consider another subdivision by points $x'_s$ such that the longest interval
is less than $\delta$. Let this give upper and lower sums $H'_p$, $h'_p$. Then any consecutive points
$x'_s$, $x'_{s+1}$ either belong to the same interval of the $x_r$ set or to two adjacent ones. If the latter
are both A intervals the leap is less than $2\omega$; if both B intervals, or if one is an A and one a
B interval, it cannot exceed $M - m$. But if a B interval is of length $\beta$, the length of the
$x'_s$ intervals that have common points with it cannot exceed $\beta + 2\delta \le 3\beta$. Hence the leaps
of f(x) in the $x'_s$ intervals are $< 2\omega$ except possibly in a set of total length $\le 3\sum\beta = 3\alpha$.
Thus

\[ H'_p - h'_p < (b - a - 3\alpha)\,2\omega + 3(M - m)\alpha < \tfrac54\epsilon. \tag{7} \]

Also $H'_p \ge H$, $h'_p \le H$; hence

\[ 0 \le H'_p - H < \tfrac54\epsilon, \qquad 0 \le H - h'_p < \tfrac54\epsilon. \tag{8} \]

Since this is true for all subdivisions such that the longest interval is less than $\delta$, the result
follows.

Finally, since $h_n \le S_n \le H_n$, $S_n$ also tends to H.
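
The behaviour of the lower and upper sums under refinement can be seen in a small computation. This is a sketch under stated assumptions: the helper `darboux_sums` estimates the bounds $m_r$, $M_r$ by sampling, which is reliable here only because the example $f(x) = x^2$ is monotonic on (0, 1); the two subdivisions are arbitrary choices.

```python
def darboux_sums(f, points, samples=50):
    """Lower and upper sums h_n, H_n for the subdivision `points`.

    The bounds on each subinterval are estimated by sampling, which gives the
    true bounds for a monotonic function because the end points are sampled.
    """
    lower = upper = 0.0
    for left, right in zip(points, points[1:]):
        values = [f(left + (right - left) * k / samples) for k in range(samples + 1)]
        lower += min(values) * (right - left)
        upper += max(values) * (right - left)
    return lower, upper

f = lambda x: x * x                  # increasing on (0, 1); the integral is 1/3
coarse = [k / 4 for k in range(5)]
fine = [k / 8 for k in range(9)]     # contains every point of `coarse`
h4, H4 = darboux_sums(f, coarse)
h8, H8 = darboux_sums(f, fine)
```

Inserting the extra points raises the lower sum and lowers the upper sum, and every lower sum stays below every upper sum, as in (3).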

1·1011. The condition is due to du Bois-Reymond. It can be stated in an alternative
form, which is sometimes more convenient. A necessary and sufficient condition for the
existence of $\int f(x)\,dx$ is that f(x) is bounded and that for any positive $\omega$, $\eta$ the discontinuities
where the leap is $\ge \omega$ can be enclosed in a finite set of intervals of total length $< \eta$. Du
Bois-Reymond's condition clearly implies this. Conversely, if the condition just stated is
satisfied, there are no discontinuities where the leap is $\ge \omega$ in the remaining intervals.
Then about any point in the remaining intervals there is an interval where the leap is
$< \omega$. Hence, by the modified Heine-Borel theorem, the remaining intervals can be divided
into a finite set such that the leap is $< \omega$ in each.

1·1012. An immediate consequence is that any continuous function has a Riemann
integral; for it is bounded and has no discontinuities at all. Also any function with a finite
number of finite discontinuities has a Riemann integral. The same applies to any function
of bounded variation. For if, for some $\omega$, there were an infinite number of discontinuities
where the leap is greater than $\omega$, it would not be of bounded variation.

Note that the condition does not require the number of discontinuities to be finite.
Take $f(x) = 1$ when $x = 1/n$, where n is any positive integer, and otherwise zero. This is
discontinuous whenever $x = 1/n$, and also at $x = 0$. But for any $\eta$ the interval $(0, \tfrac12\eta)$
contains an infinite number of discontinuities, and the remainder, with $\tfrac12\eta \le x \le 1$, are
finite in number and can be enclosed in intervals of total length $\tfrac12\eta$. Thus an infinite set of
discontinuities can sometimes be enclosed in a finite set of intervals of arbitrarily small
total length.

If $f(x) = 0$ for x irrational and $f(x) = 1/n$ for $x = m/n$, where m/n is a proper fraction in
its lowest terms, f(x) is discontinuous at all rational values of x in (0, 1), but continuous
at all irrational values. For any irrational $x_0$ can be enclosed in an interval of length $1/n!$
containing no rational fraction with a denominator less than n, and therefore the values of
f(x) in a sufficiently small interval about $x_0$ will be arbitrarily small. In this case there is
a discontinuity of f(x) in every interval, however short. Nevertheless, it has a Riemann
integral; for the number of the discontinuities where the leap of f(x) exceeds $\epsilon$ is not more
than the sum of the integers less than $1/\epsilon$, and is finite. The integral is, in fact, zero.
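
The counting argument can be made concrete. In the sketch below (the function name `large_leap_points` and the value of `eps` are illustrative assumptions) we list the fractions m/n in lowest terms in (0, 1) at which the leap 1/n exceeds $\epsilon$, and compare their number with the bound quoted above.

```python
from fractions import Fraction
from math import ceil, gcd

def large_leap_points(eps):
    """Reduced fractions m/n in (0, 1) with 1/n > eps, where the leap is 1/n."""
    points = []
    n = 2
    while Fraction(1, n) > eps:
        points.extend(Fraction(m, n) for m in range(1, n) if gcd(m, n) == 1)
        n += 1
    return points

eps = Fraction(1, 10)
points = large_leap_points(eps)
# Sum of the integers less than 1/eps, the bound used in the text.
bound = sum(range(1, ceil(1 / eps)))
```

Exact rational arithmetic is used so that the comparison with $\epsilon$ is not affected by rounding.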

If $f(x) = 1$ for x rational and $= 0$ for x irrational, then for every $x_0$, rational or not, there
are values of x arbitrarily near $x_0$ where $f(x) = 1$ and others where $f(x) = 0$. Hence every
value of x is a discontinuity where the leap is 1, and those in (0, 1) cannot be enclosed in
any set of intervals of length $< 1$. In this case $H_n = 1$, $h_n = 0$, however we subdivide the
interval.

Such types of irregularity are of little direct practical importance, but they have an
indirect importance, since we are aiming at a considerable degree of generality and therefore
need danger signals. There are other definitions of an integral, especially that of
Lebesgue, which give definite values to some integrals that do not exist in Riemann's sense
(including the one just mentioned); they contemplate an infinite set of subdivisions from
the start. They simplify the statements and extend the generality of some later theorems
appreciably. The reader is referred to the accounts given by Burkill* and Titchmarsh.†
But it appears that cases where these methods are applicable and Riemann's is not are
too rare in physics to repay the extra difficulty.

If f(x) has a Riemann integral, $\{f(x)\}^n$ $(n > 0)$ and $|f(x)|$ have Riemann integrals over
the same interval. For if f(x) is bounded and the discontinuities where the leap exceeds $\omega$
can be enclosed in intervals of arbitrarily small total length, the same applies to $\{f(x)\}^n$
and $|f(x)|$. The converse is not true. Consider $f(x) = 1$ at rational values of x, $f(x) = -1$
at irrational values; $\{f(x)\}^2$ and $|f(x)|$ are integrable, f(x) is not.

1·1013. 'Measure zero': 'Almost everywhere'. A set of points capable of being
enclosed in intervals whose total length is arbitrarily small is said to have measure zero,
and a proposition true except at such a set is said to be true almost everywhere. Any finite
set of points has measure zero; so also have the integers, since we can enclose each integer
n in an interval $2^{-|n|}\alpha$, where $\alpha$ is arbitrarily small, and $\sum 2^{-|n|}$ converges. So have the
rational numbers in (0, 1). For if p and q are integers with $p < q$ we can enclose p/q in a
range of length $\alpha/q^3$, where $\alpha$ is positive. There are $q - 1$ fractions with denominator q,
excluding 0 and 1. But 0 and 1 can be enclosed in ranges $\alpha$, and the other fractions in a
range less than $\alpha/q^2$. Summing now with regard to q we see that all rational fractions can
be enclosed in ranges of total length less than

\[ 2\alpha + \alpha \sum_{q=2}^{\infty} q^{-2}; \]

the series converges and therefore the total length can be made as small as we like by a
suitable choice of $\alpha$. The same holds for any enumerable set.
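
The estimate for the rational numbers can be followed numerically. This is a sketch only: the function `cover_length`, the cut-off `max_q` and the value of `alpha` are assumptions for illustration, and the closed form $\sum_{q \ge 2} q^{-2} = \pi^2/6 - 1$ is used for the comparison.

```python
from math import pi

def cover_length(alpha, max_q):
    """Total length of the ranges about 0, 1 and every p/q with 2 <= q <= max_q."""
    total = 2 * alpha                          # a range of length alpha about each of 0 and 1
    for q in range(2, max_q + 1):
        total += (q - 1) * alpha / q ** 3      # q - 1 fractions, each in a range alpha / q^3
    return total

alpha = 0.01
length = cover_length(alpha, 2000)
bound = 2 * alpha + alpha * (pi ** 2 / 6 - 1)  # 2*alpha + alpha * sum of q^-2 for q >= 2
```

However large the cut-off is taken, the total stays below the bound, which is proportional to $\alpha$ and so can be made as small as we please.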

Consider a decreasing sequence of positive quantities $\omega_1, \omega_2, \dots$ tending to zero. If f(x)
has a Riemann integral the points (if any) where the leap is $\ge \omega_1$ can be enclosed in a finite
set of intervals of arbitrarily small length; hence the discontinuities where the leap is
$< \omega_{n-1}$ but $\ge \omega_n$ can be enclosed in a finite set of length $2^{-n}\eta$, and all discontinuities in
a set of length $\eta$. This set of intervals is enumerable, since each can be reached in a finite
number of steps from the start; hence the discontinuities of an integrable function can
be enclosed in an enumerable set of intervals of arbitrarily small total length. These
intervals may overlap.

* J. C. Burkill, Cambridge Mathematical Tracts, No. 40, 1951.
† E. C. Titchmarsh, The Theory of Functions (1932), Chs. X, XI, XII.

1·102. Existence of Stieltjes integral. The definition in 1·10 of this type of integral
allows the function g(x) to be discontinuous. If g(x) is non-decreasing and bounded for
$a \le x \le b$ and if f(x) is also bounded, a necessary and sufficient condition that
$\int_{x=a}^{b} f(x)\,dg(x)$ shall exist is that for any $\omega$, $\delta$ the interval can be divided into a finite
number of sub-intervals, such that in the intervals where the leap of f(x) is greater than $\omega$ the total
variation of g(x) is less than $\delta$. The proof is substantially as for the Riemann integral. If
g(x) has bounded variation the same result follows by expressing g(x) as the difference of
two non-decreasing functions $\phi(x) - \psi(x)$ and considering $\int f(x)\,d\phi(x)$ and $\int f(x)\,d\psi(x)$
separately.

In particular the Stieltjes integral exists in any finite interval if g(x) has bounded
variation and f(x) is continuous. It does not exist if f(x) and g(x) have a discontinuity at
the same value of x, for in any interval including the discontinuity neither the leap of
f(x) nor the total variation of g(x) is arbitrarily small. It follows that it is not sufficient for
the existence of the Stieltjes integral that f(x) and g(x) shall both be of bounded variation.

We shall not give general conditions for the existence of the Stieltjes integral when g(x)
is not of bounded variation; we shall show that it is sufficient that g(x) shall be continuous
and f(x) of bounded variation, but it is not sufficient that both shall be continuous.

If $a < b < c$, and we write $I(d, e) = \int_{x=d}^{e} f(x)\,dg(x)$, then if I(a, c) exists both I(a, b) and
I(b, c) exist, and their sum is I(a, c). The converse is not always true. If

\[ f(x) = 0 \ (x \le 0), \quad = 1 \ (x > 0); \qquad g(x) = 1 \ (x < 0), \quad = 0 \ (x \ge 0), \]

$\int_{x=-1}^{0} f\,dg$ and $\int_{x=0}^{1} f\,dg$ both exist and are zero, but $\int_{x=-1}^{1} f\,dg$ does not exist. The converse
is true with a slightly different definition of the Stieltjes integral given by Pollard.
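
The failure over $(-1, 1)$ is easily exhibited. In the sketch below (the helper `stieltjes_sum` and the particular subdivision are assumptions for illustration) the point 0 lies strictly inside one subinterval, and the sum (3) of 1·10 takes different values according as $\xi_r$ is taken at the left or the right end of each subinterval, however fine the subdivision.

```python
def f(x):
    return 0.0 if x <= 0 else 1.0

def g(x):
    return 1.0 if x < 0 else 0.0

def stieltjes_sum(points, tags):
    """Sum of f(tag_r) * (g(x_{r+1}) - g(x_r)) for the given subdivision and tags."""
    return sum(f(t) * (g(right) - g(left))
               for left, right, t in zip(points, points[1:], tags))

n = 1001  # odd, so that 0 lies strictly inside the middle subinterval
points = [-1.0 + 2.0 * r / n for r in range(n + 1)]
s_left = stieltjes_sum(points, points[:-1])   # xi_r at the left end of each subinterval
s_right = stieltjes_sum(points, points[1:])   # xi_r at the right end of each subinterval
```

The two sums are 0 and $-1$ respectively for every odd n, so no unique limit exists.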


1·103. Differentiation.

(a) If f(x) is continuous and $\int_a^x f(u)\,du = F(x)$, then

\[ \frac{d}{dx} F(x) = f(x), \]

and F(x) is a continuous function of x.
This is almost obvious.


(b) If

\[ \frac{d}{dx} F(x) = f(x), \]

and f(x) is integrable, then $\int_a^x f(u)\,du = F(x) - F(a)$.

Since F(x) is differentiable in (a, b), we know from 1·0622 that for any positive $\omega$, $\delta$ we
can divide (a, b) into a finite set of intervals $(x_r, x_{r+1})$, all of lengths $\le \delta$, each containing a
point $\xi_r$ such that for every point of $(x_r, x_{r+1})$

\[ |F(x) - F(\xi_r) - (x - \xi_r)F'(\xi_r)| \le \omega\,|x - \xi_r| \tag{1} \]

and therefore

\[ |F(x_{r+1}) - F(x_r) - (x_{r+1} - x_r)F'(\xi_r)| \le \omega(x_{r+1} - x_r). \tag{2} \]

By addition

\[ \Bigl|F(b) - F(a) - \sum (x_{r+1} - x_r) f(\xi_r)\Bigr| \le \omega(b - a). \tag{3} \]

Since f(x) is integrable we can, given any positive $\epsilon$, choose $\delta$ so that the sum in (3)
differs from the integral by less than $\epsilon$. Hence

\[ \Bigl|F(b) - F(a) - \int_a^b f(x)\,dx\Bigr| < \epsilon + \omega(b - a) \]

and therefore is zero, since $\epsilon$ and $\omega$ are arbitrarily small.

Note that it is possible for F(x) to be differentiable and for its derivative not to be
integrable; for example,

\[ F(x) = x^2 \sin\frac{1}{x^2} \quad (0 < x \le 1), \qquad F(0) = 0. \]

The derivative exists even at x = 0, but is unbounded in any neighbourhood of 0.

(c) If f(x) has a Riemann integral $\int_a^b f(x)\,dx$, then $\int_a^x f(u)\,du$ exists and is continuous for
all x such that $a \le x \le b$; and its derivative is equal to f(x) except possibly at a set of measure
zero, namely, the points of discontinuity of f(x).

Let x be a point of continuity of f(x). Then in an interval $(x - h, x + h)$ the leap of $f(\xi)$
is $\omega$, where $\omega \to 0$ with h. Also

\[ f(x) - \omega \le \frac{1}{h}\int_x^{x+h} f(u)\,du = \frac{1}{h}\left\{\int_a^{x+h} f(u)\,du - \int_a^{x} f(u)\,du\right\} \le f(x) + \omega. \]

Making $h \to 0$ we have

\[ \frac{d}{dx}\int_a^x f(u)\,du = f(x) \]

at all points where f(x) is continuous.


It follows that if

\[ \int_a^x f(u)\,du = \int_a^x g(u)\,du \qquad (a \le x \le b), \]

then $f(x) = g(x)$ almost everywhere in $a \le x \le b$, the exceptional points, if any, being at
points of discontinuity of f(x) or g(x).

The exception in (c) is of some importance. If, for instance, $f(x) = 0$ for $x \le 0$, and
$= 1$ for $x > 0$,

\[ F(x) = \int_0^x f(u)\,du = 0 \ (x \le 0), \quad = x \ (x > 0), \]

and $F'(x)$ does not exist at x = 0. Again, if $f(x) = 0$ for $x \ne 0$ and $= 1$ for $x = 0$,
$\int f(u)\,du = 0$ over any interval and has derivative 0 everywhere; but this derivative
is not equal to f(x) when x = 0.






1·1031. Integration by parts for Stieltjes integrals. We define

\[ S_n = \sum_{r=1}^{n} f(\xi_r)\{g(x_r) - g(x_{r-1})\}, \tag{1} \]

\[ x_0 = a, \quad x_n = b, \qquad x_0 \le \xi_1 \le x_1 \le \dots \le x_{r-1} \le \xi_r \le x_r \le \dots \le x_n. \tag{2} \]

Then

\[ S_n = f(x_n)g(x_n) - f(x_0)g(x_0) - \Sigma_n, \tag{3} \]

where

\[ \Sigma_n = g(x_0)\{f(\xi_1) - f(x_0)\} + \sum_{r=1}^{n-1} g(x_r)\{f(\xi_{r+1}) - f(\xi_r)\} + g(x_n)\{f(x_n) - f(\xi_n)\}. \tag{4} \]

We assume that $I = \int_{x=a}^{b} f\,dg$ exists; that is, for any $\epsilon > 0$ we can choose $\delta$ so that for all
subdivisions such that the greatest subinterval is $< \delta$

\[ |S_n - I| < \epsilon. \tag{5} \]

Then for any set $a, \xi_1, \xi_2, \dots, \xi_n, b$ such that $\xi_1 - a, \dots, \xi_{r+1} - \xi_r, \dots, b - \xi_n$ are $< \tfrac12\delta$, any $x_r$ satisfying
the inequalities (2) will also satisfy $x_r - x_{r-1} < \delta$ for all r. Hence for such a set

\[ |\Sigma_n - f(b)g(b) + f(a)g(a) + I| < \epsilon, \tag{6} \]

and therefore $\Sigma_n$ tends to a limit as $\delta \to 0$, and this limit is by definition $\int g\,df$. Hence $\int g\,df$
exists and

\[ \int_{x=a}^{b} g\,df = \Bigl[fg\Bigr]_a^b - \int_{x=a}^{b} f\,dg. \tag{7} \]


In particular, since $\int f\,dg$ exists when f is continuous and g is of bounded variation, it
also exists when f is of bounded variation and g continuous. If g(x) is a Riemann integral
it is both continuous and of bounded variation; hence $\int f\,dg$ exists if f has either property.
For Riemann integration the result is usually stated in the form

\[ \int_a^b f(x)g'(x)\,dx = \Bigl[f(x)g(x)\Bigr]_a^b - \int_a^b f'(x)g(x)\,dx, \tag{8} \]

thus apparently requiring both f and g to be differentiable for all $a \le x \le b$. If the derivatives
exist and are integrable this follows immediately from (7) and 1·1032. But (7) is true
under much wider conditions. Incidentally the easiest way of using (8) is to integrate $g'$ first
and rewrite (8) in the form (7).
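
The identity connecting the two sums holds exactly for every subdivision, and this can be verified directly. The sketch below is illustrative only: the helper `parts_sums`, the placing of each $\xi_r$ at a midpoint, and the example $f(x) = e^x$, $g(x) = |x - \tfrac12|$ (continuous and of bounded variation, but with a corner) are assumptions.

```python
import math

def parts_sums(f, g, a, b, n):
    """S_n for f dg and Sigma_n for g df, built from the same points.

    x_0..x_n subdivide (a, b) uniformly; xi_r is the midpoint of (x_{r-1}, x_r).
    """
    x = [a + (b - a) * r / n for r in range(n + 1)]
    xi = [None] + [(x[r - 1] + x[r]) / 2 for r in range(1, n + 1)]
    s = sum(f(xi[r]) * (g(x[r]) - g(x[r - 1])) for r in range(1, n + 1))
    sigma = g(x[0]) * (f(xi[1]) - f(x[0]))
    sigma += sum(g(x[r]) * (f(xi[r + 1]) - f(xi[r])) for r in range(1, n))
    sigma += g(x[n]) * (f(x[n]) - f(xi[n]))
    return s, sigma

f = math.exp
g = lambda t: abs(t - 0.5)
a, b = 0.0, 1.0
s, sigma = parts_sums(f, g, a, b, 1000)
boundary = f(b) * g(b) - f(a) * g(a)
# For this pair the integral of f dg is e + 1 - 2*sqrt(e).
exact_fdg = math.e + 1.0 - 2.0 * math.sqrt(math.e)
```

Since `s + sigma` equals `boundary` apart from rounding, convergence of either sum carries with it convergence of the other, which is the content of (7).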

1·1032. Change of variable in an integral. If $x = h(y) = \int^{y} g(u)\,du$, and if

\[ I = \int_{y=a}^{b} f(x)\,dx, \qquad J = \int_a^b f(x)\,g(y)\,dy \]

both exist, then I = J. Since both integrals exist by hypothesis, it is enough to prove that
the partial sums tend to the same limit for some way of forming them. Take $x_r = h(y_r)$,
$\xi_r = h(\eta_r)$,

\[ I_n = \sum f(\xi_r)(x_{r+1} - x_r), \qquad J_n = \sum f(\xi_r)\,g(\eta_r)(y_{r+1} - y_r). \tag{1} \]


Since g(y) is integrable, the intervals of y can be chosen so that those where the leap of
g(y) is greater than $\omega$ have total length $\le \delta$, where $\omega$, $\delta$ are arbitrarily small. Also $|g(y)|$
is bounded, say $\le G$; hence if the greatest interval $y_{r+1} - y_r$ is less than $\lambda$, the greatest
$|x_{r+1} - x_r|$ is less than $G\lambda$. Hence $I_n \to I$, $J_n \to J$. Now if $G_r$, $g_r$ are the upper and lower
bounds of g(y) for $y_r \le y \le y_{r+1}$,

\[ g_r(y_{r+1} - y_r) \le x_{r+1} - x_r \le G_r(y_{r+1} - y_r), \tag{2} \]

\[ g_r \le g(\eta_r) \le G_r, \tag{3} \]

and therefore

\[ |x_{r+1} - x_r - g(\eta_r)(y_{r+1} - y_r)| \le (G_r - g_r)(y_{r+1} - y_r), \tag{4} \]

\[ |I_n - J_n| \le \sum |f(\xi_r)|\,(G_r - g_r)(y_{r+1} - y_r). \tag{5} \]

Let the leap of g(y) in the whole interval be N; and let $G_r - g_r$ be $\le \omega$ except in a set of
intervals of total length $\delta$. Then

\[ |I_n - J_n| \le FN\delta + \omega F(b - a - \delta), \tag{6} \]

where F is the upper bound of $|f(x)|$. This is arbitrarily small; hence

\[ I_n - J_n \to 0, \qquad I = J. \tag{7} \]

The usual form

\[ \int_{h(a)}^{h(b)} f(x)\,dx = \int_a^b f(x)\,\frac{dx}{dy}\,dy, \qquad \frac{dx}{dy} \ge 0 \tag{8} \]

is somewhat less general because it assumes dx/dy to exist everywhere. If this condition is
satisfied the theorem is proved very easily by an application of Rolle's theorem (1·13). But
the more general form is needed for transformation of integrals along curves, which may
have corners where there is no definite tangent. (8) can be made valid if at any point c
where dx/dy does not exist, we understand it to be replaced by any value between the
limits, as $\delta \to 0$, of the upper and lower bounds of dx/dy in $(c - \delta, c + \delta)$.
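
A case with a corner may be tried numerically. In this sketch the functions `g`, `h`, `f` and the helper `partial_sums` are illustrative assumptions: g has a jump at y = 1, so that $x = h(y)$ has no derivative there, and with $f(x) = x^2$ both integrals over $0 \le y \le 2$ equal $\int_0^3 x^2\,dx = 9$.

```python
def g(y):
    return 1.0 if y < 1.0 else 2.0   # integrable, with a jump at y = 1

def h(y):
    """Integral of g from 0 to y; continuous, with a corner at y = 1."""
    return y if y < 1.0 else 1.0 + 2.0 * (y - 1.0)

def f(x):
    return x * x

def partial_sums(a, b, n):
    """I_n and J_n of equation (1), with eta_r the midpoint of (y_r, y_{r+1})."""
    i_n = j_n = 0.0
    for r in range(n):
        y0 = a + (b - a) * r / n
        y1 = a + (b - a) * (r + 1) / n
        eta = (y0 + y1) / 2
        xi = h(eta)
        i_n += f(xi) * (h(y1) - h(y0))
        j_n += f(xi) * g(eta) * (y1 - y0)
    return i_n, j_n

i_n, j_n = partial_sums(0.0, 2.0, 2000)
```

Because the corner falls on a point of subdivision here, $G_r - g_r$ vanishes in every interval and the two sums agree apart from rounding; with the corner inside an interval they would differ by a term of the order of that interval's length.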

1·104. Infinite and improper integrals. The proof of the existence of an integral
breaks down if either the interval b - a is infinite or the function to be integrated is
unbounded in the interval. In the former case, b - a is infinite and we cannot make
$(b - a)\omega < \epsilon$, with $\omega > 0$, by any choice of $\omega$. In the latter the approximating sum may vary to
any extent according to the point chosen to sample f(x) in the subinterval where f(x) is
unbounded. A special device is needed in either case to give a meaning to the integral.
The method used for integrals with an infinite upper limit is to use first an integral with
a finite upper limit; if the integral tends to a definite limit when the upper limit tends to
infinity this limit is taken as the value of the infinite integral. The need for such a device
may be illustrated by the integral

\[ \int_0^\infty \frac{\sin x}{x}\,dx. \]

According to our rule this must be interpreted as

\[ \lim_{X \to \infty} \int_0^X \frac{\sin x}{x}\,dx. \]

The integral up to X exists for all X since the integrand is everywhere continuous. If
we take $Y > X$, m to be the integer next greater than $X/\pi$, n to be the greatest integer
less than $Y/\pi$,

\[ \int_X^Y \frac{\sin x}{x}\,dx = \int_X^{m\pi} \frac{\sin x}{x}\,dx + \int_{n\pi}^{Y} \frac{\sin x}{x}\,dx + \sum_{r=m}^{n-1} \int_{r\pi}^{(r+1)\pi} \frac{\sin x}{x}\,dx. \]

The first of these integrals, since $|\sin x| \le 1$ and $m\pi - X \le \pi$, is numerically $\le \pi/X$.
Similarly, the second is numerically $\le 1/n$. The sum consists of alternately positive and
negative terms, each less in magnitude than the preceding; and we have the theorem that
if $u_0 > u_1 > \dots > u_n > 0$

\[ u_0 > u_0 - u_1 + u_2 - \dots + (-1)^n u_n > 0. \]

Hence the sum is less numerically than its first term, and

\[ \left|\int_{m\pi}^{(m+1)\pi} \frac{\sin x}{x}\,dx\right| < \frac{1}{m}. \]

Thus

\[ \left|\int_X^Y \frac{\sin x}{x}\,dx\right| < \frac{\pi}{X} + \frac{1}{n} + \frac{1}{m} < \frac{3\pi}{X}, \]

which can be made arbitrarily small for all $Y > X$ by taking X large enough. Hence for
any positive quantity $\epsilon$, however small, we can choose X so that no matter how much we
increase the upper limit beyond X we cannot change the integral by more than $\epsilon$. Thus
the integral up to X has a definite limiting value as X tends to infinity, and the infinite
integral exists in the sense defined.
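
The bound can be compared with computed values. The sketch below is illustrative: the composite Simpson rule in `simpson`, the number of subintervals, and the values of X and Y are assumptions chosen for the example.

```python
import math

def integrand(x):
    return math.sin(x) / x

def simpson(func, lo, hi, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * step)
    return total * step / 3

X = 10.0
tails = [simpson(integrand, X, Y) for Y in (15.0, 50.0, 200.0)]
bound = 3 * math.pi / X
```

Each computed value of the integral from X to Y is well inside the bound $3\pi/X$, whatever Y is taken.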


1·1041. Since an integral is a function of its upper terminus, we can adapt the tests
for convergence of a sequence given in 1·0441 and 1·045 on the lines already mentioned in
the theory of continuity (1·06). The proofs are straightforward.

If $f(x) \ge 0$ and $\int_a^X f(x)\,dx$ is bounded for all $X > a$, then $\int_a^X f(x)\,dx$ tends to a limit as $X \to \infty$.

A necessary and sufficient condition that $\int_a^X f(x)\,dx$ shall tend to a limit as $X \to \infty$ is that for
any positive $\epsilon$, however small, there is an A such that $\left|\int_A^X f(x)\,dx\right| < \epsilon$ for all $X > A$.


1·1042. The relation between infinite integrals and series is so close that the same
words are convenient to express the properties:

$\int_a^\infty f(x)\,dx$ is convergent if $\lim_{X \to \infty} \int_a^X f(x)\,dx$ exists.

$\int_a^\infty f(x)\,dx$ is unbounded if $\left|\int_a^X f(x)\,dx\right|$ is unbounded as $X \to \infty$.

$\int_a^\infty f(x)\,dx = \infty$ if $\int_a^X f(x)\,dx \to \infty$ as $X \to \infty$.

$\int_a^\infty f(x)\,dx$ is finitely oscillatory if there are positive quantities $\omega$, M such that for any X we
can choose $Y_1 > X$ so that $\left|\int_X^{Y_1} f(x)\,dx\right| > \omega$, but cannot choose $Y_2$ so that $\left|\int_X^{Y_2} f(x)\,dx\right| > M$.

Examples of convergent integrals are

\[ \int_1^\infty \frac{dx}{x^2}, \qquad \int_0^\infty e^{-x}\,dx, \qquad \int_1^\infty \frac{\sin x}{x^2}\,dx, \qquad \int_0^\infty \frac{\sin x}{x}\,dx. \]

Unbounded integrals are

\[ \int_1^\infty dx, \qquad \int_1^\infty \frac{dx}{x}, \qquad \int_0^\infty x \sin x\,dx. \]

The last of these would usually be called 'infinitely oscillating' but we have no occasion
to make this distinction.

Finitely oscillatory integrals are

\[ \int_0^\infty \cos x\,dx, \qquad \int_0^\infty \sin x\,dx. \]

Unbounded and finitely oscillatory integrals have no definite values.

The integral $\int_a^\infty f(x)\,dx$ is called absolutely convergent if $\int_a^\infty |f(x)|\,dx$ is convergent. If
the former integral is convergent but the latter is not, the former is called conditionally
convergent. Of the above examples of convergent integrals, the first three are absolutely
convergent, the last conditionally convergent.


1·1043. If f(x) is positive and non-increasing for $x \ge x_0$, the integral $I = \int_{x_0}^\infty f(x)\,dx$
converges if and only if the series $\sum_{n=n_0}^{\infty} f(n)$ converges; where $n_0$ is the integer next greater
than $x_0$. For clearly neither the series nor the integral can converge unless $f(x) \to 0$;
take an integer $m > x_0$ and such that $f(m) < \epsilon$. Then

\[ f(m) + f(m+1) + \dots + f(n-1) \ge \int_m^X f(x)\,dx \ge f(m+1) + f(m+2) + \dots + f(n-1), \]

where n is the integer next greater than X. Hence

\[ \left|\int_m^X f(x)\,dx - \sum_{r=m}^{n-1} f(r)\right| \le f(m) < \epsilon \]

and $\epsilon$ is arbitrarily small. Hence if either the integral or the sum tends to a definite limit
the other does.

In particular $\int_1^\infty x^{-p}\,dx$ and $\sum_{n=1}^{\infty} n^{-p}$ both converge if and only if $p > 1$.
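
The inequalities can be checked for a particular case. The choice $f(x) = x^{-2}$, for which the integral is known exactly, and the values of m and X are assumptions made for the illustration.

```python
def f(x):
    return x ** -2.0

def integral(m, X):
    """Exact value of the integral of x^-2 from m to X."""
    return 1.0 / m - 1.0 / X

m, X = 3, 50.5
n = 51                                        # the integer next greater than X
upper = sum(f(r) for r in range(m, n))        # f(m) + f(m+1) + ... + f(n-1)
lower = sum(f(r) for r in range(m + 1, n))    # f(m+1) + ... + f(n-1)
value = integral(m, X)
```

The integral lies between the two sums, and differs from the larger by less than f(m), as stated.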


1·1044. Similarly, if f(x) tends to infinity at some point of the range we can define an
improper integral by first modifying the range so as to cut out an arbitrarily short interval
about the infinity and then making the length of this interval tend to zero. Thus

\[ \int_\epsilon^1 x^{-1/2}\,dx = \Bigl[2x^{1/2}\Bigr]_\epsilon^1 = 2 - 2\epsilon^{1/2}, \qquad \lim_{\epsilon \to 0} \int_\epsilon^1 x^{-1/2}\,dx = 2. \]

This process is taken as the definition of $\int_0^1 x^{-1/2}\,dx$, which is not directly intelligible as
it stands in terms of the definition of an integral as the limit of a sum.

The analogy between infinite and improper integrals in respect of convergence is so
close that the nomenclature can be taken over unchanged.



1·1045. Change of variable may convert an ordinary integral into an infinite or improper
one, but will not change its value. For instance, if for all y, $x = h(y)$ as in 1·1032,

\[ \int_{h(0)}^{h(y)} f(x)\,dx = \int_0^y f(x)\,g(y)\,dy, \tag{1} \]

when y and h(y) are finite, $g(y) \ge 0$, then they have the same limit if either y or h(y) or both
tend to infinity. If $h(y) \to b$ as $y \to \infty$, and $\int_{h(0)}^{b} f(x)\,dx$ exists, it is the limit of the left side
of (1); hence the limit is equal to the Riemann integral when this exists.

1*11. Functions of two variables. So far we have been considering sequences, which 
may be regarded as functions of one variable capable of taking only integral values, and 
functions of a continuous variable. In what follows we shall be concerned with what are 
essentially functions of two variables, which may be either integral or continuous. This 
introduces new complications when limiting processes are used, since it is not always 
obvious, or even true, that the same result will be obtained when the order of the limiting 
processes is changed. The simplest sufficient condition for the reversibility of limiting 
processes is provided by the following theorem on absolute convergence. 


1·111. If f(x, y) is a non-decreasing function of both x and y (either or both of which may
take only integral values), and

\[ \lim_{x \to \infty} f(x, y) = g(y), \qquad \lim_{y \to \infty} f(x, y) = h(x), \tag{1} \]

then

\[ \lim_{y \to \infty} g(y) = \lim_{x \to \infty} h(x), \tag{2} \]

in the sense that if either of the limits in (2) exists the other exists and the two are equal.

Note first that g(y) is a non-decreasing function of y. For if $y_2 > y_1$

\[ g(y_2) - g(y_1) = \lim_{x \to \infty} \{f(x, y_2) - f(x, y_1)\} \ge 0. \tag{3} \]

Similarly, if $x_2 > x_1$,

\[ h(x_2) \ge h(x_1). \tag{4} \]

Let g(y) have a limit M. For any $\epsilon$ there is a Y such that for all $y \ge Y$

\[ M \ge g(y) > M - \epsilon. \tag{5} \]

For all x, y, $M \ge g(y) \ge f(x, y)$. Also X exists such that for all $x > X$

\[ f(x, Y) > g(Y) - \epsilon, \tag{6} \]

and therefore for all $y > Y$, $x > X$

\[ M \ge f(x, y) \ge g(Y) - \epsilon \ge M - 2\epsilon. \tag{7} \]

Hence, if $y \to \infty$, $x > X$,

\[ M \ge h(x) \ge M - 2\epsilon, \tag{8} \]

and therefore, since $\epsilon$ is arbitrary, h(x) also has limit M as $x \to \infty$.
We have three immediate applications. If

$$f(m, n) = \sum_{r=1}^{m} \sum_{s=1}^{n} u_{r,s}, \tag{9}$$

where $u_{r,s}$ is not negative, $f(m, n)$ is a non-decreasing function of $m$ and $n$. Hence for
a double series of non-negative terms

$$\sum_{r=1}^{\infty} \sum_{s=1}^{\infty} u_{r,s} = \sum_{s=1}^{\infty} \sum_{r=1}^{\infty} u_{r,s}. \tag{10}$$

If some of the $u_{r,s}$ are negative, we can write

$$v_{r,s} = |u_{r,s}| + u_{r,s}, \qquad w_{r,s} = |u_{r,s}| - u_{r,s}, \tag{11}$$

and all $v_{r,s}$, $w_{r,s}$ are non-negative. If then $\sum\sum |u_{r,s}|$ exists for one order of summation we
see easily that $v_{r,s}$ and $w_{r,s}$ satisfy the condition for inversion of the order of summation,
and by subtraction $u_{r,s}$ does so. Hence for any absolutely convergent double series the
order of summation can be inverted. As a corollary, if $\sum a_r$ and $\sum b_s$ are absolutely
convergent, $\sum_r a_r \sum_s b_s = \sum\sum a_r b_s$, where the terms may be taken in any order in the sum
on the right.
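The need for absolute convergence can be illustrated by a small sketch. The array used here is not from the text; it is a standard illustrative choice, with $u_{r,s} = 1$ for $s = r$, $-1$ for $s = r + 1$ and 0 otherwise, so that $\sum\sum |u_{r,s}|$ diverges. Each row and each column has only finitely many non-zero terms, so the inner sums below are exact.

```python
def u(r: int, s: int) -> int:
    """Illustrative array (an assumption, not from the text): +1 on the diagonal, -1 just right of it."""
    if s == r:
        return 1
    if s == r + 1:
        return -1
    return 0


def rows_first(n: int) -> int:
    """Sum over r = 1..n of the complete inner sum over s (row r is non-zero only at s = r, r + 1)."""
    return sum(sum(u(r, s) for s in range(1, r + 2)) for r in range(1, n + 1))


def cols_first(n: int) -> int:
    """Sum over s = 1..n of the complete inner sum over r (column s is non-zero only at r = s - 1, s)."""
    return sum(sum(u(r, s) for r in range(1, s + 1)) for s in range(1, n + 1))


row_total = rows_first(50)   # every row sums to zero
col_total = cols_first(50)   # the first column sums to 1, all later columns to zero
```

The two orders of summation give different totals however far they are carried, which is consistent with the theorem because the series of moduli does not converge.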

If $u_m(x)$ is never negative and if $\sum_{m=1}^{p} \int_0^q u_m(x)\,dx$ exists for all $p$, $q$ and has limits as each of
$p$, $q$ tends to infinity with the other fixed, then

$$\sum_{m=1}^{\infty} \int_0^\infty u_m(x)\,dx = \int_0^\infty \Big\{\sum_{m=1}^{\infty} u_m(x)\Big\}\,dx. \tag{12}$$

If $u_m(x)$ is not always of the same sign, but $\sum_{m=1}^{p} \int_0^q |u_m(x)|\,dx$ exists and satisfies the same
conditions, and if one of

$$\sum_{m=1}^{\infty} \int_0^\infty |u_m(x)|\,dx, \qquad \int_0^\infty \sum_{m=1}^{\infty} |u_m(x)|\,dx$$

exists, then both the limits in (12) exist and the two are equal.

If $\phi(x, y)$ is non-negative, subject to similar conditions on the existence of the limits,

$$\int_0^\infty dx \int_0^\infty \phi(x, y)\,dy = \int_0^\infty dy \int_0^\infty \phi(x, y)\,dx, \tag{13}$$

where we take $f(x, y)$ to be the assumed common value for the two integrals for upper
termini $x$, $y$. As before, if $\phi(x, y)$ is not always of the same sign, a sufficient condition for
existence and equality of the double limits is that one of them shall exist for $|\phi(x, y)|$.

1·112. Uniform convergence of sequences and series. The terms of a sequence
$\{f_n(x)\}$ may be functions of a variable $x$. Then if the sequence converges for all values of $x$ in
an interval, its limit is a function of $x$, say $f(x)$. If we choose an arbitrarily small positive $\epsilon$
we shall for any $x$ be able to choose $n(x)$ so that $|f_p(x) - f(x)| < \epsilon$ for all $p \ge n(x)$, because the
sequence converges. In general the least value of $n(x)$ such that this is true will depend on
$x$. But it may be possible to choose an $n$ independent of $x$ such that $|f_p(x) - f(x)| < \epsilon$ for
all $p \ge n$ and for all $x$ in the interval. If this is possible for every $\epsilon$, $f_n(x)$ is said to be
uniformly convergent to $f(x)$ in the interval. It can fail to be uniformly convergent if there
is any value of $x$, say $c$, within or at the end of the interval such that if we take a succession of
values of $x$, tending to $c$, the corresponding values of $n(x)$ for given $\epsilon$ tend to infinity.

If we take $f_n(x)$ to be the sum of the first $n$ terms of a series, all these statements have
immediate analogues for series $\sum u_n(x)$ over an interval of $x$. Thus the geometric series
$1 + x + x^2 + \dots$ converges for all $x$ such that $0 \le x < 1$, but it is not uniformly convergent
for all such $x$.

For if we fix $\epsilon$ and then choose $n$ so as to make

$$x^n + x^{n+1} + \dots + x^{n+p} = \frac{x^n (1 - x^{p+1})}{1 - x} \tag{1}$$

less than $\epsilon$ for all $p \ge 1$ we must make

$$x^n < (1 - x)\epsilon, \tag{2}$$

and therefore

$$n > \frac{\log\{(1 - x)\epsilon\}}{\log x}, \tag{3}$$

which tends to infinity as $x$ tends to 1. This series is therefore uniformly convergent in
a range $a \le x \le b$, where $a$ and $b$ are fixed quantities between 0 and $+1$, since we can choose
$n$ greater than the greater of the quantities

$$\frac{\log\{(1 - a)\epsilon\}}{\log a}, \qquad \frac{\log\{(1 - b)\epsilon\}}{\log b}$$

and the same value of $n$ will then do for any intermediate value of $x$. It is convergent for any
$x$ such that $-1 < x < 1$. But it is not uniformly convergent in the range $-1 < x < 1$ because,
even though the signs $<$ exclude the possibilities that $x$ may be actually $-1$ or $+1$, they
permit any intermediate value, however close to 1, and however we choose $n$ we shall
always be able to find values of $x$ such that (3) is false.
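The growth of the required $n$ as $x$ approaches 1 can be tabulated directly from (2) and (3). The following is a minimal sketch; the function name `min_terms` and the sample values of $x$ and $\epsilon$ are illustrative assumptions, not part of the text.

```python
import math


def min_terms(x: float, eps: float) -> int:
    """Smallest integer n with x**n < (1 - x) * eps, i.e. n > log((1 - x) eps) / log x.

    Assumes 0 < x < 1 and 0 < eps < 1, so both logarithms are negative.
    """
    bound = math.log((1.0 - x) * eps) / math.log(x)
    return max(0, math.floor(bound) + 1)


eps = 0.01
table = {x: min_terms(x, eps) for x in (0.5, 0.9, 0.999)}
```

For a fixed $\epsilon$ the entries of `table` increase without limit as $x$ is taken closer to 1, so no single $n$ serves the whole range $0 \le x < 1$; over a closed range $a \le x \le b$ inside it the largest entry suffices.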

If $f_n(x) \to f(x)$ uniformly in each of a finite set of intervals $(a_r, b_r)$ ($r = 1$ to $k$), then it does
so uniformly in the whole set. For each interval, $n_r$ exists such that $|f_p(x) - f(x)| < \epsilon$ for
$p \ge n_r$ and $x$ in $(a_r, b_r)$. Take $m$ equal to the greatest of the $n_r$; then for $p \ge m$, $|f_p(x) - f(x)| < \epsilon$
for $x$ in any of the intervals.

If $f_n(x) \to f(x)$ in $a \le x \le b$, and $f_n(x) \to f(x)$ uniformly in $a < x < b$, then convergence is
uniform in $a \le x \le b$. We need only apply the argument of the last paragraph to the
open interval $a < x < b$ and the special points $a$, $b$, and take $m$ equal to the largest value
of $n$ for the three. This seems to be the basis of a common statement that a sequence
cannot converge uniformly in an open interval. It can, but then it also converges
uniformly in the closed interval if it converges at the end-points. But if

$$f_n(0) = 2^n, \qquad f_n(1) = 2^n, \qquad f_n(x) = 2^{-n} \quad (0 < x < 1),$$

$\{f_n(x)\}$ is uniformly convergent in the open interval but not in the closed interval.


1·113. Continuity and integrability of uniformly convergent series. The sum of
a series of continuous functions of $x$, uniformly convergent in a range, is itself a continuous
function of $x$ in the range.

The integral with regard to $x$ of the sum of a series, uniformly convergent in a finite range
of $x$, is the sum of the integrals of its terms, provided that the termini of the integral are in the
range.

To prove the first statement, let $S(x)$ be the sum of the series. Then since the series is
uniformly convergent, if $\omega$ is a positive quantity we can choose $n$ independent of $x$ so
that if $S_n(x)$ is the sum up to $u_n(x)$

$$|S(x) - S_n(x)| < \omega, \qquad |S(y) - S_n(y)| < \omega \tag{1}$$

for all $x$, $y$ in the range. But $S_n(x)$ is the sum of a finite number of continuous functions
and therefore is continuous. Hence for any $x$ we can choose $\delta$ positive but so small that

$$|S_n(y) - S_n(x)| < \omega \tag{2}$$

for all $|y - x| < \delta$. Therefore for such $y$

$$|S(y) - S(x)| < 3\omega, \tag{3}$$

and by taking $\omega = \tfrac{1}{3}\epsilon$ and then choosing $\delta$ in accordance with (2) we can make

$$|S(x + h) - S(x)| < \epsilon \tag{4}$$

for all $h$ satisfying $0 \le |h| < \delta$. Hence $S(x)$ is continuous (and therefore integrable).
To prove the second statement, we have, if

$$S(x) = S_n(x) + R_n(x), \tag{5}$$

and $|R_n(x)| < \omega$ for all $x$ such that $a \le x \le b$,

$$\int_a^b S(x)\,dx = \int_a^b S_n(x)\,dx + \int_a^b R_n(x)\,dx \tag{6}$$

and

$$\left|\int_a^b R_n(x)\,dx\right| < \omega(b - a), \tag{7}$$

which is arbitrarily small. Hence by taking $n$ large enough we can make

$$\int_a^b S_n(x)\,dx = \sum_{r=1}^{n} \int_a^b u_r(x)\,dx \tag{8}$$

as near as we like to $\int_a^b S(x)\,dx$. The theorem is often expressed by saying that a uniformly
convergent series of continuous functions can be integrated term by term in any finite range.

A uniformly convergent series can also be integrated term by term if the terms are
integrable but not necessarily continuous. If $S(x)$ is integrable the argument from (5)
still holds. Take $n$ so that $|R_n(x)| < \omega$. The leap of $R_n(x)$ exceeds $2\omega$ in no interval.
$S_n(x)$ is integrable. Divide $(a, b)$ into a finite set of intervals so that the total length
of those where the leap of $S_n(x)$ exceeds $\omega$ is less than $\delta$. Then the total length of those
where the leap of $S(x)$ exceeds $3\omega$ is less than $\delta$. $\omega$ and $\delta$ are arbitrary; hence $S(x)$ is
integrable.

If $f_n(x) \to f(x)$ uniformly in $(a, b)$, $\int_a^x f_n(\xi)\,d\xi$ tends uniformly to $\int_a^x f(\xi)\,d\xi$. For if
$|f_n(\xi) - f(\xi)| < \omega$ throughout $(a, b)$,

$$\left|\int_a^x \{f_n(\xi) - f(\xi)\}\,d\xi\right| < \omega(x - a) \le \omega(b - a).$$


1·114. Discontinuity associated with non-uniform convergence. The geometric
series does not converge at either $x = 1$ or $x = -1$ and therefore does not define a value
of the function at the limits; thus the question of continuity does not arise. But it is
possible for a series to converge at certain values of $x$ and yet not to be uniformly
convergent in a range approaching them. The example given by Stokes, who first discussed
this property, was

$$u_n(x) = \left(\frac{1}{n} - \frac{1}{n+1}\right) + 2\left(\frac{1}{(n-1)x+1} - \frac{1}{nx+1}\right). \tag{1}$$

$\sum u_n(x)$ converges for all $x$, since

$$\sum_{m=n}^{n+p} u_m(x) = \left(\frac{1}{n} - \frac{1}{n+p+1}\right) + 2\left(\frac{1}{(n-1)x+1} - \frac{1}{(n+p)x+1}\right). \tag{2}$$

Take $x > 0$. Then

$$\sum_{m=n}^{n+p} u_m(x) < \frac{1}{n} + \frac{2}{(n-1)x+1}, \tag{3}$$

and we can make this less than $\epsilon$ by taking $n$ large enough. If $x = 0$, the last bracket in
(2) is 0 and the sum is $< 1/n$. The series is therefore convergent for $x \ge 0$. But it is not
uniformly convergent. For if the quantity on the right of (3) is greater than $\epsilon$ the quantity
on the left can be made greater than $\epsilon$ by taking $p$ large enough; and $1/n$ is always positive.
If, then,

$$\frac{2}{(n-1)x+1} > \epsilon, \tag{4}$$

that is,

$$(n-1)x < \frac{2}{\epsilon} - 1, \tag{5}$$

the left of (3) will exceed $\epsilon$ for $p$ large enough; and to make the left of (3) less than $\epsilon$ for
all $p$ we must take $n \ge \dfrac{2/\epsilon - 1}{x} + 1$. Hence, if we fix $\epsilon$ at the start, the appropriate values
of $n$ increase without limit as $x$ is made smaller, and the series is not uniformly convergent.
Stokes described such series as converging with infinite slowness near $x = 0$.

Now consider the sum of the series. We have for all $x$

$$\sum_{m=1}^{n} u_m(x) = \left(1 - \frac{1}{n+1}\right) + 2\left(1 - \frac{1}{nx+1}\right),$$

and the sum of the series, if $x$ is not zero, is 3, since the terms on the right containing $n$
tend to zero with increasing $n$. Hence the limit of the sum as $x$ tends to 0 is 3. But if we
put $x$ zero first the terms in the second bracket cancel for all $n$, and the sum is 1.

This example is artificial, but the functions used are quite simple, and it serves to
illustrate the fact that the results of carrying out two limiting processes may be quite
different according to which we do first. We have to make $x$ tend to 0 and $n$ to infinity.
If we make $x$ tend to 0 first and then $n$ to infinity we get 1; if we make $n$ tend to infinity
first and then $x$ to 0 we get 3.
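The two orders of limit in Stokes's example can be seen numerically. The sketch below sums the terms $u_k(x) = (1/k - 1/(k+1)) + 2\{1/((k-1)x+1) - 1/(kx+1)\}$ directly; the function name and the particular values of $n$ and $x$ are illustrative assumptions.

```python
def partial_sum(n: int, x: float) -> float:
    """Sum of the first n terms of Stokes's series, added term by term (x >= 0)."""
    total = 0.0
    for k in range(1, n + 1):
        total += (1.0 / k - 1.0 / (k + 1))
        total += 2.0 * (1.0 / ((k - 1) * x + 1.0) - 1.0 / (k * x + 1.0))
    return total


many_terms_at_zero = partial_sum(100_000, 0.0)      # n large with x = 0: close to 1
many_terms_small_x = partial_sum(100_000, 0.01)     # n large with x > 0: close to 3
few_terms_tiny_x = partial_sum(100, 1e-6)           # x far too small for n = 100: still near 1
```

With $n$ fixed, taking $x$ small enough always drags the partial sum back towards 1, which is the infinite slowness of convergence near $x = 0$ described above.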

1·115. Tests for uniform convergence. A necessary and sufficient condition that
$\{u_n(x)\}$ shall be uniformly convergent in an interval $(a, b)$ is that for any $\epsilon$ we can choose $m$
so that for all $n \ge m$, $|u_n(x) - u_m(x)| < \epsilon$ for every $x$ of $(a, b)$. The proof given for simple
sequences needs little alteration. (See 1·045.)

1·1151. M test. If for all $x$ in the interval considered $|u_n(x)| \le v_n$, where $v_n$ is independent
of $x$, and the series $\sum v_n$ converges, then $\sum u_n(x)$ is uniformly convergent in the interval. For
we can choose $n$ to make for all $p \ge 0$

$$\sum_{m=n}^{n+p} v_m < \epsilon,$$

since $\sum v_n$ converges; and then for any $x$

$$\left|\sum_{m=n}^{n+p} u_m(x)\right| \le \sum_{m=n}^{n+p} |u_m(x)| \le \sum_{m=n}^{n+p} v_m < \epsilon.$$

This test is known as Weierstrass's M test.

The use of comparison series for testing ordinary convergence rests on the same
principle, and we need only state the theorem. If as $n \to \infty$, $|u_n| \le v_n$, and $\sum v_n$ converges,
then $\sum u_n$ converges.

The M test is very simple to apply and we shall have numerous applications of it.


1·1152. Extension of the M test. A modification of the M test is sometimes
useful even for conditionally convergent series where we cannot find a convergent series
of positive terms $v_n$ numerically greater than $u_n(x)$. Suppose that as $n \to \infty$, $u_n(x)$ tends
uniformly to 0 (see below); that the terms of $\sum u_n(x)$ can be taken in batches of $m$ without
deranging the order, giving a series $\sum U_\nu(x)$; and that $|U_\nu(x)| \le V_\nu$, where $\sum V_\nu$ is convergent.
Then $\sum U_\nu(x)$ is uniformly convergent by the M test. It remains to show that in the
conditions stated $\sum u_n(x)$ exists and is equal to $\sum U_\nu(x)$.

Since $u_n(x)$ tends uniformly to 0, for any $\epsilon$ we can choose $n$ so that $|u_p(x)| < \epsilon$ for all
$p \ge n$ and for all $x$ in the range. Then if we take $\nu$ for given $n$ so that

$$\nu m \le n < (\nu + 1)m,$$

$$\left|\sum_{r=1}^{n} u_r(x) - \sum_{\sigma=1}^{\nu} U_\sigma(x)\right| = |u_{\nu m+1}(x) + \dots + u_n(x)|.$$

Take $\nu$ so that $\left|\sum_{\sigma=\nu+1}^{\infty} U_\sigma(x)\right| < \tfrac{1}{2}\epsilon$, and so that all $|u_p(x)| < \epsilon/2m$ for $p > m\nu$. Then

$$\left|\sum_{r=1}^{n} u_r(x) - \sum_{\sigma=1}^{\infty} U_\sigma(x)\right| < \epsilon$$

for all $x$, and all $n > m\nu$, and $\sum u_n(x)$ is uniformly convergent.

This can be applied to the series

$$\sum_{n=1}^{\infty} (-1)^n \frac{n}{n^2 + x^2}.$$

For by taking the terms in pairs we get a series whose terms are numerically not greater
than those of $\sum \dfrac{1}{n(n+1)}$, and which therefore satisfies the M test. Also the general term is
numerically $< 1/n$ for all $x$, and therefore tends uniformly to zero. Hence the series is
uniformly convergent.


1·1153. Abel's lemma. Though the M test is the commonest in actual applications,
series may be uniformly convergent and not satisfy it. Two more sensitive tests are based
on Abel's lemma. All these tests have analogues for integrals.

If $\{v_r\}$ is a non-increasing sequence of non-negative quantities, and if the sums

$$s_p = \sum_{r=1}^{p} a_r$$

satisfy the inequalities $h \le s_p \le H$ for all $p$, then $hv_1 \le \sum_{p=1}^{n} a_p v_p \le Hv_1$ for all $n$. We have
$a_1 = s_1$, $a_2 = s_2 - s_1$, ..., $a_n = s_n - s_{n-1}$,

$$S_n = \sum_{p=1}^{n} a_p v_p = s_1 v_1 + \sum_{p=2}^{n} (s_p - s_{p-1}) v_p$$

$$= s_1(v_1 - v_2) + \dots + s_{n-1}(v_{n-1} - v_n) + s_n v_n.$$

Since all $v_p - v_{p+1} \ge 0$ and $v_n \ge 0$, the last sum will not be decreased if all the $s_p$ are replaced
by $H$; and therefore $S_n \le Hv_1$. Similarly the sum will not be increased if all the $s_p$ are
replaced by $h$; hence $S_n \ge hv_1$.
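A quick numerical check of the lemma may help fix the roles of $h$, $H$ and $v_1$. The sequences chosen below ($a_p = \pm 1$ alternately and $v_p = 1/p$) and the helper name `abel_bounds` are illustrative assumptions only.

```python
from itertools import accumulate


def abel_bounds(a, v):
    """Return (h * v_1, S_n, H * v_1), where S_n = sum a_p v_p and h, H bound the partial sums of a.

    v must be non-increasing and non-negative, as the lemma requires.
    """
    assert all(v[i] >= v[i + 1] for i in range(len(v) - 1)) and v[-1] >= 0
    s = list(accumulate(a))                    # partial sums s_1, ..., s_n
    h, H = min(s), max(s)
    S = sum(ap * vp for ap, vp in zip(a, v))
    return h * v[0], S, H * v[0]


a = [(-1) ** (p + 1) for p in range(1, 21)]    # 1, -1, 1, ...: partial sums are 1, 0, 1, 0, ...
v = [1.0 / p for p in range(1, 21)]            # non-increasing, non-negative
lower, S, upper = abel_bounds(a, v)
```

Here $h = 0$ and $H = 1$, so the lemma confines $1 - \tfrac12 + \tfrac13 - \dots - \tfrac1{20}$ to the range from 0 to 1 without any appeal to absolute convergence.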


1·1154. Abel's test. If the series $\sum a_r$ is convergent (not necessarily absolutely) and if for
all $x$ in an interval $\{v_r(x)\}$ is a sequence of positive quantities, bounded with respect to $x$ and $r$
and non-increasing for given $x$ as $r$ increases, then $\sum a_r v_r(x)$ is uniformly convergent in the
interval. In Abel's lemma take

$$s_n = \sum_{p=1}^{n} a_{m+p}, \qquad S_n(x) = \sum_{p=1}^{n} a_{m+p} v_{m+p}(x)$$

and take $m$ so that $|s_n| < \omega$ for all $n$. Then by Abel's lemma, with

$$h = -\omega, \qquad H = \omega, \qquad v_1 = v_{m+1}(x) \le M,$$

$$-\omega M \le S_n(x) \le \omega M$$

for all $x$ of the interval and all $n \ge 1$. Since $\omega$ is arbitrarily small and independent of $x$,
uniform convergence follows.*

The most important application of this theorem is to power series $\sum a_n x^n$. If the series
converges for $x = 1$, the powers $x^n$, for $0 < x \le 1$, satisfy the conditions imposed on $v_n(x)$;
hence the series $\sum a_n x^n$ converges uniformly up to $x = 1$ and the limit of its sum is $\sum a_n$.
This is Abel's theorem. It saves a great deal of trouble; for we often get a result in the form
of a power series and want to know whether the sum of the series for $x < 1$ tends in the
limit to the sum for $x = 1$ when $x$ is made to approach 1. The theorem gives us a simple
answer: it does so provided the series for $x = 1$ converges.

The theorem is still true if $a_n = a_n(x)$ and $\sum a_n(x)$ is uniformly convergent. The proof
needs no change.
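Abel's theorem can be tried on a familiar case. The series $\sum_{n\ge1} (-1)^{n+1} x^n/n$, whose sum for $|x| < 1$ is $\log(1 + x)$, is an illustrative choice not taken from the text; it converges (conditionally) at $x = 1$, so the theorem predicts that its sum there is the limit $\log 2$.

```python
import math


def power_series_sum(x: float, terms: int) -> float:
    """Partial sum of sum_{n>=1} (-1)^(n+1) x^n / n, i.e. of the series for log(1 + x)."""
    return sum((-1) ** (n + 1) * x ** n / n for n in range(1, terms + 1))


at_one = power_series_sum(1.0, 200_000)          # series at the end-point x = 1
near_one = power_series_sum(0.999, 20_000)       # series just inside the interval
gap = abs(at_one - math.log(2.0))
```

For this alternating series the remainder after $n$ terms is at most $x^{n+1}/(n+1) \le 1/(n+1)$ for every $x$ in $0 \le x \le 1$, a bound independent of $x$, which is the uniformity that the theorem asserts.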


1·1155. Dirichlet-Hardy test.† If in an interval of $x$, $\sum_{r=1}^{n} a_r(x)$ is uniformly bounded with
respect to $n$ and $x$, and $\{v_r\}$ is a sequence of positive non-increasing quantities tending to
zero, then $\sum a_r(x) v_r$ is uniformly convergent in the interval. We can extend this to the case
where $v_r = v_r(x)$ provided that $v_r(x) \to 0$ uniformly.

Take $S_n(x) = \sum_{p=1}^{n} a_{m+p}(x)\, v_{m+p}$, where $m$ is such that $v_m(x) < \omega$. Then if, for all $n$,

$$-M \le \sum_{p=1}^{n} a_{m+p}(x) \le M,$$

we have by Abel's lemma $-M\omega \le S_n(x) \le M\omega$.

Uniform convergence follows.

A remarkable feature of this test is that it establishes uniform convergence without
requiring any comparison series to converge. The most important applications are to
series of the forms $\sum v_n \cos n\theta$, $\sum v_n \sin n\theta$. Here

$$\sum_{r=1}^{n} \cos r\theta = \frac{\sin(n + \tfrac12)\theta - \sin\tfrac12\theta}{2\sin\tfrac12\theta}, \qquad \sum_{r=1}^{n} \sin r\theta = \frac{\cos\tfrac12\theta - \cos(n + \tfrac12)\theta}{2\sin\tfrac12\theta}.$$


* The case where $v_n(x) = x^n$ ($0 \le x \le 1$) was proved and used by Abel. The general form of the
theorem is due to Hardy.

† Dirichlet gave a test for convergence of a series of constants, which Hardy converted into a
test for uniform convergence. Hardy proposed to call the tests Abel's and Dirichlet's respectively,
but the application to uniform convergence in the latter case is entirely due to Hardy.






If $\sin\tfrac12\delta$, with $\pi \ge \delta > 0$, is the smallest value of $|\sin\tfrac12\theta|$ in a range, the modulus of
neither sum exceeds $\operatorname{cosec}\tfrac12\delta$, whatever $n$ and $\theta$ may be. If then $v_1 \ge v_2 \ge \dots \ge v_n \to 0$, it
follows that $\sum v_n \cos n\theta$ and $\sum v_n \sin n\theta$ are uniformly convergent in any closed interval
$a \le \theta \le b$ that contains no zero of $\sin\tfrac12\theta$; that is, excluding $\theta = 0, 2\pi, 4\pi, \dots$.

In particular, the series

$$1 + \cos\theta + \tfrac12\cos 2\theta + \tfrac13\cos 3\theta + \dots,$$
$$\sin\theta + \tfrac12\sin 2\theta + \tfrac13\sin 3\theta + \dots,$$

are uniformly convergent in any range $a \le \theta \le b$ that excludes $0, 2\pi, \dots$. Actually the first
diverges at $\theta = 0$, the second converges everywhere, but not uniformly in any interval
containing $\theta = 0$. We shall see later (ch. 14, ex. 4) that it jumps from $-\tfrac12\pi$ to $\tfrac12\pi$ as $\theta$
increases through 0, so that non-uniform convergence is associated with a discontinuity
as in 1·114.
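The uniform bound on the trigonometric sums, which is what the Dirichlet-Hardy test needs, can be checked by direct summation. In this sketch the grid of angles, the value of $\delta$ and the function names are illustrative assumptions.

```python
import math


def cos_sum(n: int, theta: float) -> float:
    """cos(theta) + cos(2 theta) + ... + cos(n theta), summed directly."""
    return sum(math.cos(r * theta) for r in range(1, n + 1))


def sin_sum(n: int, theta: float) -> float:
    """sin(theta) + sin(2 theta) + ... + sin(n theta), summed directly."""
    return sum(math.sin(r * theta) for r in range(1, n + 1))


delta = 0.1
bound = 1.0 / math.sin(0.5 * delta)              # cosec(delta / 2)
# Angles in delta <= theta <= 2*pi - delta, where |sin(theta/2)| >= sin(delta/2).
thetas = [delta + k * (2 * math.pi - 2 * delta) / 200 for k in range(201)]
worst = max(
    max(abs(cos_sum(n, t)), abs(sin_sum(n, t)))
    for t in thetas
    for n in range(1, 60)
)
```

The largest modulus found, `worst`, stays below `bound` however many terms are taken, whereas at $\theta = 0$ the cosine sum is simply $n$ and no such bound exists.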

1·116. Theorem of bounded convergence. Uniformity of convergence is a sufficient
condition for continuity or integrability of the sum, provided the separate terms are
continuous or integrable. It is far from a necessary condition. In practice it is usually
easier to test directly whether the limit function is integrable than to test for uniform
convergence, and there are so many cases where the passage to the limit under the
integral sign is valid without convergence being uniform that a more general rule is needed.
Such a rule is as follows. It is known as the theorem of bounded convergence. If for all
$a \le x \le b$, $|f_n(x)| < M$ for all $n$ and $x$, if all $f_n(x)$ are integrable and if $f_n(x) \to f(x)$, where $f(x)$ is
integrable, then $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$. The proof is not easy, but the result should be known.

The behaviour of $f_n(x)$ and

$$\lim_{n\to\infty} \int_0^1 f_n(x)\,dx, \qquad \int_0^1 \lim_{n\to\infty} f_n(x)\,dx$$

should be studied for the cases

$$f_n(x) = xe^{-nx}, \qquad f_n(x) = nxe^{-nx}, \qquad f_n(x) = n^2 xe^{-nx}.$$
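The three cases can be compared through closed forms. The sketch below assumes the range of integration $0 \le x \le 1$ and writes the three functions together as $n^k x e^{-nx}$ with $k = 0, 1, 2$; both the range and this parametrization are assumptions made for illustration. It uses $\int_0^1 x e^{-nx}\,dx = \{1 - (n+1)e^{-n}\}/n^2$ and the fact that $xe^{-nx}$ is greatest at $x = 1/n$.

```python
import math


def integral(k: int, n: int) -> float:
    """Integral over 0 <= x <= 1 of n**k * x * exp(-n x), from the closed form."""
    return n ** k * (1.0 - (n + 1) * math.exp(-n)) / n ** 2


def sup_norm(k: int, n: int) -> float:
    """Greatest value of n**k * x * exp(-n x) on 0 <= x <= 1 (n >= 1), attained at x = 1/n."""
    return n ** k / (n * math.e)


n = 1000
summary = {k: (integral(k, n), sup_norm(k, n)) for k in (0, 1, 2)}
```

In every case the limit function is zero for each fixed $x$. For $k = 0$ the convergence is uniform; for $k = 1$ it is not uniform, but the functions are bounded by $1/e$ and the integrals still tend to 0, as the theorem of bounded convergence requires; for $k = 2$ the functions are unbounded and the integrals tend to 1 instead of 0.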


1·117. Useful comparison series. By far the most important comparison series are
$\sum x^n$ ($0 \le x < 1$), $\sum n^{-s}$ ($s > 1$), which we have already studied, and $\sum n^s a^n$ ($0 \le a < 1$). The
convergence of the latter follows at once from the M test if $s < 0$. If $s \ge 0$, we have

$$\frac{u_{n+1}}{u_n} = \left(1 + \frac{1}{n}\right)^{s} a.$$

As $n$ increases this tends to $a$. Hence, since $a < 1$, we can take $m$ large enough for this
ratio to be less than $b$ for all $n \ge m$, where $a < b < 1$. Then for $n > m$,

$$u_n < u_m b^{n-m},$$

and $\sum b^{n-m}$ is a convergent series of positive terms. Hence $\sum n^s a^n$ converges for $0 \le a < 1$.

Comparison with the series $\sum n^{-s}$ can often be simplified. If $v_n = n^{-s}$ ($1 < s$),

$$n\left(\frac{v_{n-1}}{v_n} - 1\right) = n\left\{\left(1 - \frac{1}{n}\right)^{-s} - 1\right\} \to s.$$

If $u_n$ is positive for all $n$, and

$$n\left(\frac{u_{n-1}}{u_n} - 1\right) \to t > 1,$$

we can take $s = \tfrac12(t + 1)$, and then we can choose $m$ so that for all $n > m$

$$\frac{u_{n-1}}{u_n} > \frac{v_{n-1}}{v_n},$$

and then

$$u_n < u_m \left(\frac{m}{n}\right)^{s}$$

and $\sum u_n$ converges. Similarly, if $t$ exists and is less than 1, $\sum u_n$ diverges. If $t = 1$ a more
sensitive test is needed, but we shall find no such case in this book.

To summarize, if $u_n > 0$, $\sum u_n$ converges if either

$$u_n/u_{n-1} \to k < 1,$$

or if

$$n\left(\frac{u_{n-1}}{u_n} - 1\right) \to t > 1.$$

1·12. Uniform convergence of infinite integrals. If the integrand depends on $x$
and also on another parameter $y$, the notion of uniform convergence arises as for series.
We shall suppose in all cases that $\int_0^X f(x)\,dx$ exists however large $X$ may be. This remark is
needed because no meaning can be attached to the convergence of an integral, that is, to
the proposition that a set of integrals with finite upper termini tend to a limit when the
upper terminus tends to infinity, unless these integrals all exist. It is with the convergence
of the infinite integral, assuming the existence of the finite integrals, that we are concerned
in what follows. In particular, if $\int_0^X f(x)\,dx$ exists and $\int_X^Y |f(x)|\,dx < \epsilon$ for all $Y > X$,
$\int_0^\infty f(x)\,dx$ converges; but the existence of $\int_0^X |f(x)|\,dx$ does not guarantee that of
$\int_0^X f(x)\,dx$. If for any $\epsilon$ we can choose $X$ so that for all $Y$ greater than $X$ and for all $y$ in
the range $b_0$ to $b_1$

$$\left|\int_X^Y f(x, y)\,dx\right| < \epsilon$$

the integral $\int_a^\infty f(x, y)\,dx$ is said to be uniformly convergent in the range $b_0 \le y \le b_1$. This
property permits the reversal of the order of integration in a repeated integral even when
one of the limits is infinite. By a repeated integral we mean one of the form

$$\int_{b_0}^{b_1} dy \int_{a_0}^{a_1} f(x, y)\,dx,$$

where $f(x, y)$ is to be integrated with regard to $x$ between $a_0$ and $a_1$ and the result with
regard to $y$ from $b_0$ to $b_1$. Let us consider the integral, where $f(x, y)$ is supposed continuous
with regard to both $x$ and $y$,

$$I = \int_{b_0}^{b_1} dy \int_a^\infty f(x, y)\,dx = \int_{b_0}^{b_1} dy \left\{\int_a^X f(x, y)\,dx + \int_X^\infty f(x, y)\,dx\right\}. \tag{1}$$

Now since all limits are finite

$$\int_{b_0}^{b_1} dy \int_a^X f(x, y)\,dx = \int_a^X dx \int_{b_0}^{b_1} f(x, y)\,dy. \tag{2}$$

(The proof is simple.*) $X$ is at our disposal; choose it so that for all $Y > X$

$$\left|\int_X^Y f(x, y)\,dx\right| < \omega. \tag{3}$$

Then the second part of (1) is numerically not greater than $(b_1 - b_0)\omega$; and

$$\left|I - \int_a^X dx \int_{b_0}^{b_1} f(x, y)\,dy\right| \le (b_1 - b_0)\omega. \tag{4}$$

But $\omega$ is arbitrarily small and we can always choose $X$ so that (3) will be satisfied. Hence

$$I = \lim_{X\to\infty} \int_a^X dx \int_{b_0}^{b_1} f(x, y)\,dy = \int_a^\infty dx \int_{b_0}^{b_1} f(x, y)\,dy, \tag{5}$$

which establishes the theorem.

This theorem can be stated in the form: a uniformly convergent integral can be integrated
under the integral sign. It follows that an infinite integral $\int_a^\infty f(x, y)\,dx$ can be differentiated
under the integral sign with regard to $y$ provided that $\partial f/\partial y$ exists and that its integral
with regard to $x$ is uniformly convergent in the neighbourhood of the value of $y$ under
consideration. This follows immediately by putting $\partial f/\partial y$ for $f(x, y)$ in the last theorem.

An extension to uniformly convergent integrals, where $f(x, y)$ is not necessarily
continuous, can be made on the lines of the argument at the end of 1·113.


1·121. M test. The commonest test for uniform convergence is the analogue of the
M test for series. If for all $y$ such that $b_0 \le y \le b_1$,

$$|f(x, y)| \le g(x),$$

where $\int_a^\infty g(x)\,dx$ converges, $\int_a^\infty f(x, y)\,dx$ is uniformly and absolutely convergent in $b_0 \le y \le b_1$.


1·122. Abel's lemma for integrals. If $v(x)$ is non-negative, bounded in $a \le x \le b$ and
non-increasing with $x$, and if $h$, $H$ are the lower and upper bounds of

$$F(\xi) = \int_a^\xi f(x)\,dx$$

for $a \le \xi \le b$, then

$$hv(a) \le \int_a^b f(x) v(x)\,dx \le Hv(a).$$

Put

$$I = \int_a^b f(x) v(x)\,dx = \int_{x=a}^{b} v(x)\,dF(x) = v(b) F(b) - \int_{x=a}^{b} F(x)\,dv(x).$$

This is valid because $F(x)$ is an integral and therefore continuous, and $v(x)$ is of bounded
variation. Then, since $v(x)$ is nowhere increasing, $I$ will not be decreased if $F(x)$ is replaced
everywhere by its upper bound, or increased if it is replaced by its lower bound; then

$$h\left\{v(b) - \int_{x=a}^{b} dv(x)\right\} \le I \le H\left\{v(b) - \int_{x=a}^{b} dv(x)\right\},$$

that is, $hv(a) \le I \le Hv(a)$.


* For $f(x, y)$ continuous; the statement will be seen in Chapter 5 to be true under somewhat wider
conditions.

It is necessary for integrals to specify that $v(x)$ is bounded; being non-negative it must
have a lower bound, but it might be unbounded near $x = a$ if this is not stated separately.

1·123. Abel's test for infinite integrals. If $\int_a^\infty f(x)\,dx$ converges (not necessarily
absolutely) and if for every value of $y$ in $b_0 \le y \le b_1$ the function $v(x, y)$ is non-negative, bounded
for all $x$, $y$ and never increasing with $x$, then $\int_a^\infty f(x) v(x, y)\,dx$ is uniformly convergent with
respect to $y$ in $b_0 \le y \le b_1$.

We have $0 \le v(x, y) \le M$; take $X$ so that

$$\left|\int_X^{X'} f(x)\,dx\right| < \omega \quad \text{for all } X' > X;$$

then by Abel's lemma, for $b_0 \le y \le b_1$

$$\left|\int_X^{X'} f(x) v(x, y)\,dx\right| \le \omega M,$$

whence uniform convergence follows since $\omega$ can be taken arbitrarily small.

For instance $\int_0^\infty \dfrac{\sin x}{x}\,dx$ converges; and $e^{-xy}$ is positive, bounded and not increasing
with $x$ for $0 \le y < \infty$. Hence $\int_0^\infty e^{-xy} \dfrac{\sin x}{x}\,dx$ is uniformly convergent for $y \ge 0$.

1·124. Dirichlet-Hardy test for infinite integrals. If $\int_a^X f(x, y)\,dx$ is bounded for
all $X > a$ and for $b_0 \le y \le b_1$, and if $v(x)$ is bounded, positive, non-increasing, and tends to
zero as $x \to \infty$, then $\int_a^\infty f(x, y) v(x)\,dx$ is uniformly convergent for $b_0 \le y \le b_1$. Here we can take
$X$ so that $v(X) < \omega$ and for all $X' > X$ there is $M$ such that

$$-M < \int_X^{X'} f(x, y)\,dx < M.$$

Then for $b_0 \le y \le b_1$ and all $X' > X$

$$\left|\int_X^{X'} f(x, y) v(x)\,dx\right| \le M\omega.$$

Uniform convergence follows as before.

Note that $\int_a^X f(x, y)\,dx$ is not required to tend to a limit as $X \to \infty$; it may oscillate finitely.

For instance,

$$\int_0^X \sin xy\,dx = \frac{1 - \cos Xy}{y},$$

and if $|y| \ge \delta > 0$ this is numerically not greater than $2/\delta$. Also $1/x$ is positive and tends to zero
with increasing $x$. Hence $\int_a^\infty \dfrac{\sin xy}{x}\,dx$, where $a > 0$, is uniformly convergent in any range
such that $|y| \ge \delta > 0$. It is not uniformly convergent in any range that includes $y = 0$.
Actually it is equal to $+\tfrac12\pi$ for $y > 0$, $-\tfrac12\pi$ for $y < 0$, and 0 for $y = 0$; so that, as for series,
non-uniform convergence of an integral can be associated with discontinuity of its value.

Uniform convergence of an integral of a continuous function is a very useful sufficient
condition for continuity of the integral and for the legitimacy of integration under the
integral sign. We have had one case where non-uniform convergence is associated with
discontinuity of the integral. The following example, given by Courant,* shows that it
can be associated with the impossibility of reversing the order of integration. If

$$f(x, y) = (2 - xy)\,xy\,e^{-xy},$$

we find

$$\int_0^1 dx \int_0^\infty f(x, y)\,dy = 0, \qquad \int_0^\infty dy \int_0^1 f(x, y)\,dx = 1.$$

We have

$$\int_0^y f(x, u)\,du = xy^2 e^{-xy}.$$

For any $x \ne 0$ this tends to 0 as $y \to \infty$; and for $x = 0$, $f(x, y) = 0$ for all $y$. Thus $\int_0^\infty f(x, y)\,dy$
is convergent, but it is not uniformly convergent near $x = 0$, since if $\eta$ is the larger value
of $y$ that makes $xy^2 e^{-xy} = \epsilon$, $xy^2 e^{-xy} < \epsilon$ for all $y > \eta$; but $\eta$ tends to infinity as $x \to 0$. In
fact $\int_0^y f(x, u)\,du$ is unbounded with regard to $y$ as $x \to 0$.

The extension of the results for integrals with respect to two variables to integrals
with respect to three and more variables involves no new principles.
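Courant's example lends itself to a short numerical check. The sketch uses the closed form $\int_0^y f(x, u)\,du = xy^2e^{-xy}$ and, by the symmetry of $f$ in $x$ and $y$, $\int_0^1 f(x, y)\,dx = ye^{-y}$; the Simpson routine, the cut-off at $y = 60$ and the sample points are illustrative assumptions.

```python
import math


def inner_to_y(x: float, y: float) -> float:
    """Integral of f(x, u) du from 0 to y for f(x, u) = (2 - x u) x u exp(-x u)."""
    return x * y * y * math.exp(-x * y)


def simpson(func, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


# Order "x first, then y": the inner integral over 0 <= x <= 1 is y * exp(-y).
x_first = simpson(lambda y: y * math.exp(-y), 0.0, 60.0)

# Order "y first, then x": the inner integral to infinity vanishes for each fixed x,
# but on the way there it reaches 4 exp(-2) / x at y = 2 / x, which is unbounded as x -> 0.
peak_small_x = inner_to_y(1e-3, 2e3)
tail_fixed_x = inner_to_y(0.5, 200.0)
```

The value `x_first` is close to 1, while the other order gives 0 because each inner integral vanishes; the large value of `peak_small_x` shows the failure of uniformity near $x = 0$ that permits the discrepancy.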

The following application of Dirichlet's test is sometimes useful. Let

$$I = \int_a^\infty \cos\{f(x)\}\,dx,$$

where $f'(x)$ is a positive increasing function for $x \ge a$, and $f'(x) \to \infty$ as $x \to \infty$. Put $f(x) = y$,
$f'(x) = 1/g(y)$, $f(a) = b$. Then $y$ is an increasing function of $x$, and

$$I = \int_b^\infty \cos y\; g(y)\,dy.$$

But $g(y) > 0$, and is a decreasing function with limit 0. Hence a sufficient condition for
$\int_a^\infty \cos\{f(x)\}\,dx$, $\int_a^\infty \sin\{f(x)\}\,dx$ to converge is that $f'(x)$ is an increasing function tending to $\infty$.

For instance

$$\int_0^\infty \cos x^2\,dx, \qquad \int_0^\infty \cos(x^3 - mx)\,dx \quad (m \text{ real})$$

converge. (For the latter, if m > 0, take a > 

1·125. Integrals with upper limit tending to infinity. If $f(x, n) \to g(x)$, and $\lambda_n \to \infty$
when $n \to \infty$, we sometimes need a condition that

$$\int_a^{\lambda_n} f(x, n)\,dx \to \int_a^\infty g(x)\,dx. \tag{1}$$

The question is clearly related to that of uniform convergence; in fact we can define a
function

$$h(x, n) = f(x, n) \quad (a \le x \le \lambda_n), \qquad h(x, n) = 0 \quad (\lambda_n < x) \tag{2}$$


* Differential and Integral Calculus, 2, 1936, 316.

and then

$$\int_a^{\lambda_n} f(x, n)\,dx = \int_a^\infty h(x, n)\,dx. \tag{3}$$

Consequently a sufficient condition for (1) is that $h(x, n)$ shall satisfy any of the tests of
1·121, 1·123, and 1·124.

Detailed proofs of the required forms of 1·123 and 1·124, and of the analogues for series,
are given by Bromwich* under the name of Tannery's theorem.

1·126. Inversion of infinite double series and repeated integrals. The theorem
(due to E. H. Moore) for uniform convergence corresponding to 1·111 is as follows.

If as $y \to \infty$, $f(x, y) \to h(x)$, and if as $x \to \infty$, $f(x, y) \to g(y)$ uniformly, then

$$\lim_{x\to\infty} h(x), \qquad \lim_{y\to\infty} g(y) \tag{1}$$

both exist and are equal. Take $X$ so that $|f(x, y) - g(y)| < \omega$ for $x \ge X$ and all $y$. Then take $Y$
so that $|f(X, y) - h(X)| < \omega$ for $y \ge Y$. Then for $x \ge X$ and $y \ge Y$

$$f(x, y) - h(X) = \{f(x, y) - g(y)\} - \{f(X, y) - g(y)\} + \{f(X, y) - h(X)\},$$

$$|f(x, y) - h(X)| < 3\omega, \qquad |g(y) - h(X)| \le 3\omega, \tag{2}$$

and therefore if $y_1 > Y$,

$$|g(y_1) - g(Y)| \le 6\omega. \tag{3}$$

Since $\omega$ is arbitrary, $g(y)$ has a limit $F$.

Hence we can take $Y'$ so that $|g(y) - F| < \omega$ for all $y \ge Y'$. If $Y''$ is the greater of $Y$, $Y'$,
and $x \ge X$, $y \ge Y''$,

$$|f(x, y) - F| \le |f(x, y) - g(y)| + |g(y) - F| < 2\omega. \tag{4}$$

Let $y \to \infty$; then $|h(x) - F| \le 2\omega$ and $h(x) \to F$.

There are corollaries for sums of double series, sums of infinite integrals, and repeated
integrals, analogous to those of 1·111.

For series, if

$$f(m, n) = \sum_{r=1}^{m} \sum_{s=1}^{n} u(r, s), \tag{5}$$

the conditions are: as $n \to \infty$, $f(m, n)$ converges for any $m$, and as $m \to \infty$, $f(m, n)$ converges
uniformly to $g(n)$ for all $n$.

If

$$f(m, x) = \sum_{r=1}^{m} \int_0^x u_r(\xi)\,d\xi, \tag{6}$$

the conditions are: the integrals converge as $x \to \infty$ for any $m$, and the sum for finite $x$
converges uniformly for all $x$ greater than some $x_0$. Alternatively, taking

$$f(m, x) = \int_0^x \Big\{\sum_{r=1}^{m} u_r(\xi)\Big\}\,d\xi,$$

we have the conditions: the series converges over any finite interval of $\xi$, and the integral
converges uniformly for all $m$ greater than $m_0$.

If

$$f(x, y) = \int_0^x d\xi \int_0^y \phi(\xi, \eta)\,d\eta = \int_0^y d\eta \int_0^x \phi(\xi, \eta)\,d\xi, \tag{7}$$

the conditions are: the integral with regard to $\eta$ converges in any interval of $\xi$, and the
second integral converges uniformly for all $y$ greater than some $Y$.

* Theory of Infinite Series, 1908, 123, 438, 443.

1·13. Mean-value theorems. We have seen that a continuous function takes
its upper and lower bounds in any interval. Let $f(x)$ be continuous for $a \le x \le b$ and
have a derivative $f'(x)$ for $a < x < b$, and let $f(a) = f(b) = 0$, but for some intermediate
value $\xi$, $f(\xi) > 0$. Then let $x = \eta$ correspond to the upper bound of $f(x)$ in the interval.
$\eta$ is not equal to $a$ or $b$, since $f(\eta) \ge f(\xi) > 0$. Now

$$f'(\eta) = \lim_{h\to 0} \frac{f(\eta + h) - f(\eta)}{h}.$$

If $h > 0$, $f(\eta + h) \le f(\eta)$, and therefore $f'(\eta) \le 0$. If $h < 0$, $f(\eta + h) \le f(\eta)$ and therefore $f'(\eta) \ge 0$.
These are consistent only if $f'(\eta) = 0$. $f'(x)$ is not required to be continuous.

If $f(x)$ has a lower bound $< 0$ in the interval, we obtain the same result by applying
the argument to $-f(x)$. Hence if $f(x)$ has a derivative in $a < x < b$, and is continuous in
$a \le x \le b$, the derivative vanishes between any two zeros of $f(x)$. This result is known as Rolle's
theorem.

If $c$, $d$ are any two values of $x$ such that $f(x)$ is continuous for $c \le x \le d$ and has a derivative
for $c < x < d$, consider the function

$$g(x) = f(x) - f(c) - \frac{x - c}{d - c}\{f(d) - f(c)\}.$$

This vanishes for $x = c$ and $x = d$. Its derivative therefore vanishes for some $x$ between
$c$ and $d$, say $\eta$; and then

$$0 = g'(\eta) = f'(\eta) - \frac{f(d) - f(c)}{d - c}.$$

Thus

$$f(d) - f(c) = (d - c) f'(\eta),$$

where $c < \eta < d$. This is the mean-value theorem for derivatives. Geometrically it states
that if we take a chord of a smooth curve, the tangent at some intermediate point is
parallel to the chord.
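The theorem asserts only that some $\eta$ exists, but in simple cases it can be located numerically. The sketch below finds a root of $f'(x) - \{f(d) - f(c)\}/(d - c)$ by bisection; it assumes that this difference changes sign between $c$ and $d$, which is true for the sample function $f(x) = x^3$ on $0 \le x \le 2$ but is not guaranteed in general. The function name and the example are illustrative assumptions.

```python
def mean_value_point(f, df, c: float, d: float, tol: float = 1e-12) -> float:
    """Bisection for eta in (c, d) with df(eta) equal to the slope of the chord from c to d.

    Assumes df(x) - slope has opposite signs at c and d (an assumption of this sketch).
    """
    slope = (f(d) - f(c)) / (d - c)
    lo, hi = c, d
    g_lo = df(lo) - slope
    if g_lo * (df(hi) - slope) > 0:
        raise ValueError("df(x) - slope does not change sign on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = df(mid) - slope
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


# Chord of y = x**3 from x = 0 to x = 2 has slope 4, so 3 * eta**2 = 4.
eta = mean_value_point(lambda x: x ** 3, lambda x: 3 * x * x, 0.0, 2.0)
```

For this example the tangent parallel to the chord is at $\eta = 2/\sqrt{3}$, strictly between the end-points as the theorem requires.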

The most important application is: if $f(x)$ is continuous in $a \le x \le b$ and $f'(x) = 0$ for
$a < x < b$, then $f(x)$ is constant in $(a, b)$. For $f(x) - f(a) = (x - a) f'(\xi)$, where $a < \xi < x$; but
$f'(\xi) = 0$ and therefore $f(x) = f(a)$. Note that it is not sufficient that $f'(x) = 0$ almost
everywhere. A function is known, continuous in $(0, 1)$, and with a derivative almost
everywhere zero in $(0, 1)$, but the function is not constant. But it is sufficient that
$f'(x) = 0$ except at a finite set of points, where $f(x)$ is continuous. If $f(x)$ is continuous,
and $F(x) = \int_a^x f(u)\,du$, then

$$\frac{dF(x)}{dx} = f(x).$$

If also $G'(x) = f(x)$, then $F(x) - G(x)$ is a continuous function with zero derivative
everywhere and is therefore constant. The corresponding theorem where $f(x)$ is given only to
be integrable was given in 1·103. Either can be used to justify the method of integration
by first finding a function whose derivative is $f(x)$. Another consequence is useful in some
cases where a derivative is required at a point $x = a$ but for some special reason is difficult
to evaluate there, owing, for instance, to failure of an integral or series representing it to
converge. If $f(x)$ is continuous and $f'(x)$ exists except possibly at $x = a$, we have

$$\frac{f(a + h) - f(a)}{h} = f'(a + \theta h),$$

where $0 < \theta < 1$. Now let $h$ tend to zero; then if the left side tends to a limit this limit is
$f'(a)$. But if $f'(x)$ tends to a limit when $x$ tends to $a$ the right side tends to this limit, and
the left side, being equal to the right, also has this limit. Hence if $f(x)$ is continuous and
$f'(x)$ has a limit as $x$ tends to $a$, $f'(a)$ exists and is equal to this limit.

If $\dfrac{1}{h}\{f(a + h) - f(a)\}$ has a limit as $h \to 0$ through positive values, this limit may be called
the derivative of $f(x)$ on the right at $a$ and denoted by $f'(a+)$. The last argument applies
equally to show that if $f(x)$ is continuous on the right at $a$ and $f'(a + h)$ has a limit as $h \to 0$
through positive values, $f'(a+)$ exists and is equal to this limit. Similar properties hold
for derivatives on the left. The statement that $f'(a)$ exists is equivalent to the statement
that $f'(a+)$ and $f'(a-)$ exist and are equal.

1·131. The first mean-value theorem for integrals. If for all $x$ such that $a \le x \le b$,
\[ m \le f(x) \le M, \]
then
\[ m(b-a) \le \int_a^b f(x)\,dx \le M(b-a), \]
and therefore
\[ \int_a^b f(x)\,dx = N(b-a), \]
where $N$ is such that $m \le N \le M$. In particular if $f(x)$ is continuous there is a $\xi$ in the
range such that $f(\xi) = N$, and
\[ \int_a^b f(x)\,dx = (b-a) f(\xi). \]
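The theorem can be illustrated numerically. The sketch below is not from the text: it assumes a midpoint-rule quadrature and a function that is continuous and monotonic on the interval (so that bisection finds $\xi$), and the names `integrate` and `mean_value_point` are hypothetical.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite midpoint rule; an illustrative quadrature, not part of the text."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def mean_value_point(f, a, b, tol=1e-12):
    """Return (N, xi) with f(xi) = N, where N is the integral divided by (b - a).

    Assumes f is continuous and monotonic on [a, b], so bisection applies.
    """
    N = integrate(f, a, b) / (b - a)
    lo, hi = a, b
    increasing = f(b) >= f(a)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Move the end of the bracket that lies on the same side of xi as mid.
        if (f(mid) < N) == increasing:
            lo = mid
        else:
            hi = mid
    return N, 0.5 * (lo + hi)

# Example: f = exp on (0, 1); here m = 1 and M = e bound N.
N, xi = mean_value_point(math.exp, 0.0, 1.0)
print(N, xi)
```

For this example $N$ lies between $m = 1$ and $M = e$, and $\xi$ lies inside the interval, as the theorem requires.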


1·132. Extension of first mean-value theorem. To obtain Taylor's theorem
we require a special case of an extension of the first mean-value theorem; namely, if
$g(x) \ge 0$ for $a \le x \le b$, and $m \le f(x) \le M$,
\[ m\int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le M\int_a^b g(x)\,dx, \]
which we can write in the form
\[ \begin{aligned} \int_a^b f(x)g(x)\,dx &= N\int_a^b g(x)\,dx \qquad (m \le N \le M) \\ &= f(\xi)\int_a^b g(x)\,dx \qquad (a \le \xi \le b), \end{aligned} \]
the last being true for some $\xi$ if $f(x)$ is continuous.


1·133. Taylor's theorem. Let $f(x)$ have derivatives up to order $n$, and denote the
$n$th derivative by $f^{(n)}(x)$. Consider the function
\[ g(x) = f(x) - f(0) - x f'(0) - \dots - \frac{x^{n-1}}{(n-1)!} f^{(n-1)}(0). \]
On differentiating $n-1$ times in succession we see that $g(x)$ and all its derivatives up to
the $(n-1)$th vanish at $x = 0$. Also
\[ g^{(n)}(x) = f^{(n)}(x). \]
Now consider the integral
\[ h(x) = \int_0^x \frac{(x-u)^{n-1}}{(n-1)!} f^{(n)}(u)\,du, \]
where $f^{(n)}(u)$ is supposed bounded and integrable. (The reason for this choice will appear
when we discuss operational methods.)

By repeated integration by parts
\[ \begin{aligned} h(x) &= \left[\frac{(x-u)^{n-1}}{(n-1)!} f^{(n-1)}(u)\right]_0^x + \int_0^x \frac{(x-u)^{n-2}}{(n-2)!} f^{(n-1)}(u)\,du \\ &= -\frac{x^{n-1}}{(n-1)!} f^{(n-1)}(0) - \dots - x f'(0) + \int_0^x f'(u)\,du \\ &= g(x). \end{aligned} \]
Hence
\[ f(x) = f(0) + x f'(0) + \dots + \frac{x^{n-1}}{(n-1)!} f^{(n-1)}(0) + \int_0^x \frac{(x-u)^{n-1}}{(n-1)!} f^{(n)}(u)\,du. \]

This is an exact form of Taylor's theorem. It does not require $f^{(n)}(u)$ to be continuous.
Also
\[ \int_0^x \frac{(x-u)^{n-1}}{(n-1)!}\,du = \frac{x^n}{n!}, \]
whence if $m$ and $M$ are the lower and upper bounds of $f^{(n)}(u)$ for $0 \le u \le x$, the integral lies
between $m x^n/n!$ and $M x^n/n!$. Hence it is equal to $N x^n/n!$, where $m \le N \le M$. If further
$f^{(n)}(u)$ is continuous there will be a value $\theta x$ ($0 \le \theta \le 1$) that makes it equal to $N$, and the
integral can be written
\[ \frac{x^n}{n!} f^{(n)}(\theta x). \]
This is Lagrange's form of the remainder. But the form $N x^n/n!$ only requires the $n$th
derivative to be integrable.

An alternative form of the remainder, due to Cauchy, is got by noticing that, if $f^{(n)}(u)$
is continuous, there is a $\theta$ such that $0 \le \theta \le 1$ and such that
\[ \int_0^x (x-u)^{n-1} f^{(n)}(u)\,du = x (x - \theta x)^{n-1} f^{(n)}(\theta x) \]
by the first mean-value theorem for integrals; hence the remainder can be written
\[ \frac{x^n (1-\theta)^{n-1}}{(n-1)!} f^{(n)}(\theta x). \]
It will be seen that these forms of Taylor's theorem do not require the convergence of
an infinite series. But they are much used in proving convergence by showing that the
remainder tends to 0 for large $n$ and in estimating the error possible for a given finite
number of terms.
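The exact form of Taylor's theorem and the bounds on the remainder can be checked numerically. The following is a minimal sketch under stated assumptions: the function is taken to be the exponential (every derivative at 0 equals 1), the integral is estimated by a midpoint rule, and the names `taylor_polynomial` and `integral_remainder` are hypothetical.

```python
import math

def taylor_polynomial(x, n):
    """f(0) + x f'(0) + ... + x^(n-1)/(n-1)! f^(n-1)(0) for f = exp."""
    return sum(x**r / math.factorial(r) for r in range(n))

def integral_remainder(x, n, steps=20000):
    """Midpoint-rule estimate of the integral over 0 <= u <= x of
    (x - u)^(n-1)/(n-1)! * f^(n)(u), with f^(n) = exp."""
    h = x / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += (x - u)**(n - 1) / math.factorial(n - 1) * math.exp(u)
    return total * h

x, n = 1.5, 4
remainder = integral_remainder(x, n)
exact_gap = math.exp(x) - taylor_polynomial(x, n)

# Bounds m x^n/n! and M x^n/n!, with m = exp(0) and M = exp(x) on 0 <= u <= x.
lower = 1.0 * x**n / math.factorial(n)
upper = math.exp(x) * x**n / math.factorial(n)
print(remainder, exact_gap, lower, upper)
```

The integral reproduces the difference between the function and its first $n$ terms, and lies between the two bounds, without any appeal to an infinite series.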


1·134. The second mean-value theorem for integrals. We give first a simple
consequence of Abel's lemma for integrals. Let $f(x)$ be integrable in $(a, b)$ and $\phi(x)$ have
bounded variation. Let $P(x)$, $N(x)$ be the positive and negative variations of $\phi(x)$ in $(a, x)$
and let the greater of $P(b)$, $N(b)$ be $\omega$. Then $\omega - P(x)$, $\omega - N(x)$ satisfy the conditions
imposed on $v(x)$ in Abel's lemma, and if $h$, $H$ are the lower and upper bounds of $\int_a^x f(t)\,dt$
for $a \le x \le b$,
\[ \omega h \le \int_a^b \{\omega - P(x)\} f(x)\,dx \le \omega H, \tag{1} \]
\[ \omega h \le \int_a^b \{\omega - N(x)\} f(x)\,dx \le \omega H. \tag{2} \]
By subtraction,
\[ \left| \int_a^b f(x)\{\phi(x) - \phi(a)\}\,dx \right| \le \omega(H - h). \tag{3} \]
Hence $\int_a^b f(x)\phi(x)\,dx$ can be replaced by $\phi(a)\int_a^b f(x)\,dx$ within a known uncertainty. This
result is important especially in the theory of Fourier series and integrals.
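Inequality (3) lends itself to a direct numerical check. The sketch below is an illustration only: the choices $f(x) = \cos 5x$ and $\phi(x) = 1/(1+x)$ on $(0, 4)$, the uniform grid, and the name `abel_bound_check` are assumptions, and the variations $P$, $N$ and the bounds $h$, $H$ are estimated on the grid rather than found exactly.

```python
import math

def abel_bound_check(f, phi, a, b, steps=50000):
    """Return (|integral of f*(phi - phi(a))|, omega*(H - h)), estimated on a uniform grid."""
    dx = (b - a) / steps
    running = 0.0        # running value of the integral of f from a to x
    h = H = 0.0          # lower and upper bounds of the running integral (0 at x = a)
    P = N = 0.0          # positive and negative variations of phi in (a, x)
    lhs = 0.0
    phi_a = prev = phi(a)
    for i in range(steps):
        mid = a + (i + 0.5) * dx
        lhs += f(mid) * (phi(mid) - phi_a) * dx
        running += f(mid) * dx
        h, H = min(h, running), max(H, running)
        cur = phi(a + (i + 1) * dx)
        P += max(cur - prev, 0.0)
        N += max(prev - cur, 0.0)
        prev = cur
    omega = max(P, N)
    return abs(lhs), omega * (H - h)

lhs, bound = abel_bound_check(lambda x: math.cos(5 * x),
                              lambda x: 1.0 / (1.0 + x), 0.0, 4.0)
print(lhs, bound)
```

Here $\phi$ is decreasing, so $P = 0$ and $\omega = N(b) = 0.8$, while the running integral $\tfrac15 \sin 5x$ ranges over $(-0.2, 0.2)$; the bound is therefore about $0.32$.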

The second mean-value theorem as usually understood means one of the following.
Bonnet's form: if $\phi(x)$ is positive and non-increasing for $a \le x \le b$, there is an $\eta$ such that
$a \le \eta \le b$,
\[ \phi(a)\int_a^\eta f(x)\,dx = \int_a^b f(x)\phi(x)\,dx. \tag{4} \]
Du Bois-Reymond's form: if $\phi(x)$ is monotonic for $a \le x \le b$, there is a $\xi$ such that $a \le \xi \le b$,
\[ \int_a^b f(x)\phi(x)\,dx = \phi(a)\int_a^\xi f(x)\,dx + \phi(b)\int_\xi^b f(x)\,dx. \tag{5} \]
Both are easily derived from Abel's lemma; but, as Bromwich remarks,* they contain no
information not contained in the lemma itself because no means is provided for estimating
$\xi$ and $\eta$; and they are less directly informative than (3) above.


1·14. Infinite products. If
\[ \Pi_n = (1 + a_1)(1 + a_2) \dots (1 + a_n), \tag{1} \]
the limit of $\Pi_n$ when $n$ tends to infinity, if it exists and is not zero, is denoted by
\[ \prod_{n=1}^{\infty} (1 + a_n), \tag{2} \]
and the infinite product is then said to converge. It is zero if any factor is zero. (In the
latter case it is often said to converge to zero to distinguish it from such a product as
\[ (1 - \tfrac12)(1 - \tfrac13)(1 - \tfrac14)\dots, \]
which tends to zero without any factor being zero. Such products are said to diverge
to zero.)

The theory of convergence of infinite products is closely related to that of infinite
series; in fact, if all $a_n$ are positive, or all negative, a necessary and sufficient condition
for the convergence of (2) is that $\sum a_n$ shall converge. We have
\[ S_n = \log \Pi_n = \sum_{r=1}^{n} \log(1 + a_r). \tag{3} \]
Clearly neither $\prod(1 + a_n)$ nor $\sum a_n$ can converge unless $a_n \to 0$. We need therefore only
consider the case $a_n \to 0$. Then
\[ \frac{\log(1 + a_n)}{a_n} \to 1. \]

* Theory of Infinite Series, 1908, 426-7. 
Hence the ratios of corresponding terms of the two series $\sum \log(1 + a_n)$ and $\sum a_n$ are bounded
and the series converge or not together. But if $S_n$ has a limit $S$, $\Pi_n$ has a limit $e^S$, and conversely
if $\Pi_n$ tends to a limit, not $\infty$ or 0, $S_n$ tends to a limit; which proves the proposition.

If the $a_n$ are not all of one sign and $\sum |a_n|$ is convergent, $\prod(1 + a_n)$ is easily shown to
be convergent. If $\sum |a_n|$ is not convergent but $a_n \to 0$ we can choose $m$ so that for $n > m$
\[ \log(1 + a_n) = a_n - A_n a_n^2, \]
where $c < |A_n| < d$ and $c$, $d$ are fixed; and therefore, if $\sum a_n^2$ is convergent and $\sum a_n$ is
conditionally convergent, $\prod(1 + a_n)$ is convergent.

Consider
\[ \prod \left\{ 1 + \frac{(-1)^n}{\sqrt{n}} \right\}, \]
where $\sum a_n$ converges but $\sum a_n^2$ does not. This product is easily proved not convergent.
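The contrast between the two cases can be seen from partial products. The sketch below is illustrative only: the helper name `partial_product` is hypothetical, the products are started at $n = 2$ (an assumption, made so that no factor vanishes), and an even number of factors is taken so that the factors pair off.

```python
import math

def partial_product(a, n_terms, start=2):
    """Product of (1 + a(n)) for n = start, ..., start + n_terms - 1."""
    p = 1.0
    for n in range(start, start + n_terms):
        p *= 1.0 + a(n)
    return p

# a_n = (-1)^n / n: sum a_n converges conditionally and sum a_n^2 converges.
convergent = partial_product(lambda n: (-1)**n / n, 100000)

# a_n = (-1)^n / sqrt(n): sum a_n converges but sum a_n^2 does not.
short = partial_product(lambda n: (-1)**n / math.sqrt(n), 1000)
long_ = partial_product(lambda n: (-1)**n / math.sqrt(n), 100000)

print(convergent, short, long_)
```

In the first case consecutive factors cancel in pairs and the partial products stay at 1; in the second they keep shrinking as more factors are taken, which is the behaviour described as divergence to zero.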

1·15. The Lipschitz condition. If
\[ |f(\xi) - f(x)| < A\,|\xi - x|^{\alpha} \]
for given $x$ and all $|\xi - x| < \delta$, where $A$, $\alpha$ are independent of $\xi$, and $\alpha > 0$, $f(\xi)$ is said to
satisfy a Lipschitz condition of order $\alpha$ at $\xi = x$. If $f(\xi)$ satisfies a Lipschitz condition it is
continuous at $\xi = x$, and if it satisfies one for all $x$ in $a \le x \le b$ it is continuous for $a \le x \le b$.
But even if $\alpha = 1$ the function need not be differentiable or have bounded variation.
For instance, take
\[ f(0) = 0, \qquad f(x) = x \sin\frac{1}{x}. \]
This satisfies a Lipschitz condition of order 1 even at $x = 0$; but it is not differentiable at
$x = 0$, and it has unbounded variation in any interval including $x = 0$. The function $|x|$
satisfies a Lipschitz condition of order 1 at $x = 0$, and has bounded variation in any
interval, but is not differentiable at $x = 0$. For some theorems it is found that the Lipschitz
condition is sufficient when continuity is not sufficient and differentiability is
sufficient but not necessary.
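The behaviour of $x \sin(1/x)$ at the origin can be sampled directly. In this sketch (the sample points and function names are assumptions chosen for illustration) the difference quotient at $x = 0$ is evaluated at points where $\sin(1/x) = \pm 1$.

```python
import math

def f(x):
    """f(0) = 0, f(x) = x sin(1/x) otherwise."""
    return 0.0 if x == 0 else x * math.sin(1.0 / x)

def difference_quotient(x):
    """{f(x) - f(0)} / x, the quotient whose limit would be f'(0)."""
    return (f(x) - f(0.0)) / x

# Points x_k = 1/((k + 1/2) pi), which tend to 0 as k increases.
points = [1.0 / ((k + 0.5) * math.pi) for k in range(6)]
quotients = [difference_quotient(x) for x in points]
print(quotients)
```

The quotients alternate between values close to $+1$ and $-1$ however near the points are to 0, so no derivative exists there, although $|f(x) - f(0)| \le |x|$ holds at every sample point.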

If a Lipschitz condition of order $\alpha > 1$ is satisfied at $x$, clearly $f'(x) = 0$. If at every
point of the interval it is satisfied for some $\alpha > 1$, $f'(x) = 0$ throughout the interval.
Hence $f(x)$ is constant; consequently only the case $0 < \alpha \le 1$ is of much interest.

An important case is where $f(x)$ satisfies a Lipschitz condition of order 1 uniformly in
$(a, b)$; that is, if a constant $A$ exists such that $|f(x_2) - f(x_1)| < A\,|x_2 - x_1|$ for all $x_1$, $x_2$ in
$(a, b)$. Clearly $f(x)$ must be both continuous and of bounded variation. It is not necessarily
differentiable at all points of $(a, b)$, as we see from the example $f(x) = |x|$ in $(-1, 1)$. But
it can be proved that $f(x)$ has a derivative almost everywhere and is the Lebesgue integral
of a bounded function. This is a particular case of a rather difficult theorem due to W. H.
and G. C. Young that a function of bounded variation has a derivative almost everywhere;*
but the special case is equivalent to the proposition that a curve of finite length
has a tangent almost everywhere, an elementary proof of which has been given by
A. S. Besicovitch.†

* Proc. Lond. Math. Soc. (2) 9, 1911, 326-35. Incidentally this is not true of all continuous
functions.

† J. Lond. Math. Soc. 19, 1944, 205-7.
1·16. Cauchy's inequality. If $a_1, \dots, a_n$, $b_1, \dots, b_n$ are any numbers
\[ (a_1^2 + \dots + a_n^2)(b_1^2 + \dots + b_n^2) - (a_1 b_1 + \dots + a_n b_n)^2 = (a_1 b_2 - a_2 b_1)^2 + \dots. \]
This is an extension of Lagrange's identity. Then
\[ \sum_{r=1}^{n} a_r^2 \sum_{r=1}^{n} b_r^2 \ge \left( \sum_{r=1}^{n} a_r b_r \right)^2. \]
This is Cauchy's inequality. It follows that if $\phi(x)$, $\psi(x)$ are two functions,
\[ \int_a^b \phi^2(x)\,dx \int_a^b \psi^2(x)\,dx \ge \left\{ \int_a^b \phi(x)\psi(x)\,dx \right\}^2. \]
This is Schwarz's inequality. In particular if $\psi = 1$, $\phi(x) = |f(x)|$,
\[ (b - a)\int_a^b f^2(x)\,dx \ge \left\{ \int_a^b |f(x)|\,dx \right\}^2. \]
Also if $n = 2$, and $b_1^2 + b_2^2 = 1$, $(a_1 b_1 + a_2 b_2)^2 \le a_1^2 + a_2^2$. There are numerous applications
of similar results.
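The identity behind Cauchy's inequality can be verified exactly with integers. This sketch assumes that the terms indicated by the dots on the right of the identity are the squares $(a_r b_s - a_s b_r)^2$ taken over all pairs $r < s$; the function name `lagrange_sides` and the sample numbers are hypothetical.

```python
from itertools import combinations

def lagrange_sides(a, b):
    """Return both sides of the extended Lagrange identity for sequences a and b."""
    lhs = (sum(x * x for x in a) * sum(y * y for y in b)
           - sum(x * y for x, y in zip(a, b)) ** 2)
    # Sum of (a_r b_s - a_s b_r)^2 over all pairs r < s.
    rhs = sum((a[r] * b[s] - a[s] * b[r]) ** 2
              for r, s in combinations(range(len(a)), 2))
    return lhs, rhs

a = [1, -2, 3, 4]
b = [2, 0, -1, 5]
lhs, rhs = lagrange_sides(a, b)
print(lhs, rhs)
```

Since the right side is a sum of squares it cannot be negative, which is exactly Cauchy's inequality.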


EXAMPLES 


1. Which, if any, of the field axioms given in 1·01 are not fulfilled by (1) the set of all positive
integers, (2) the set consisting of zero and all square roots of positive integers?

2. A student (naturally not at Cambridge) was heard to say that he had found a route from his 
lodgings to his lectures and back that was downhill both ways. Which of the axioms does his notion 
of height fail to satisfy? 

3. If $s_n \ge s_{n-1}$, $t_n \ge t_{n-1}$, and for every $m$ there is an $n$ such that $t_n > s_m$, and for every $m$ there is a $p$
such that $s_p > t_m$, and $\{s_n\}$ is bounded, then $\{t_n\}$ converges to the same limit as $\{s_n\}$.


4. Prove that if $a_0 = 3$, $a_{n+1} = 3 - \dfrac{2}{a_n}$, then $a_n \to 2$.

Show graphically, using the curves $y = 3 - \dfrac{2}{x}$, $y = x$, that the sequence has limit 2 for all values
of $a_0$ except $a_0 = 1$. (Infinite values of $a_n$ are allowed.) (I.C. 1938.)


5. If $s_{n+1} = \sqrt{2s_n + a}$, where $s_1$ and $a$ are positive and the positive value of the square root is
taken, prove that as $n$ tends to infinity, $s_n$ tends to the limit $1 + \sqrt{a + 1}$. (M.T. 1940.)

n ( s i \ 

6. Prove that, if s> 1, then £ —— j — slogw 

tends to a limit as $n \to \infty$, $s$ remaining fixed; and that, if this limit is $\phi(s)$, then
\[ 0 < \phi(s) + \frac{1}{s - 1} < s - 1. \]
(M.T. 1938.)


7. Show how to derive a positive root of the equation
\[ x^3 + 4x - 1 = 0 \]
by considering the convergence and the limit of the sequence defined by
\[ s_{n+1} = \frac{1}{s_n^2 + 4}. \]
Determine this root to four places of decimals. (I.C. 1937.)
8. Solve the following difference equations

(!) Un+i 1 l t/ n + y n -i - 0, where y 0 = 0 ,y 1 = 1. 

(2) 2 /»+i + %« + 2z n = 0, z n+1 + 2z n + 2y n = 0. (I.C. 1941.) 

9. By expressing $\sin m\theta$ and $\sinh mu$ in terms of exponentials, prove the identity
\[ \sum_{m=1}^{\infty} \frac{\sin m\theta}{\sinh mu} = \sum_{n=1}^{\infty} \frac{\sin\theta}{\cosh(2n-1)u - \cos\theta} \qquad (u > 0,\ \theta \text{ real}). \]
(M.T. 1938.)


10. If $f(x)$ is equal to 0 for irrational values of $x$ and to 1 for rational values, prove that $x f(x)$ is
continuous at $x = 0$ and nowhere else, and that $x^2 f(x)$ is differentiable at $x = 0$ and nowhere else.


11. Prove that if $a^2 \ne 1$,
\[ \frac{a^{2n} - 1}{a^2 - 1} = \prod_{r=1}^{n-1} \left( 1 - 2a\cos\frac{r\pi}{n} + a^2 \right). \]
If $a$ is real, prove that
\[ \int_0^{\pi} \log(1 - 2a\cos x + a^2)\,dx = 0 \]
if $|a| < 1$, and find its value if $|a| > 1$. (M.T. 1939.)

12. Prove that the sum of the series
\[ \frac{|x|}{1 + |x|} + \frac{|x|}{(1 + |x|)^2} + \frac{|x|}{(1 + |x|)^3} + \dots \]
exists for all real values of $x$ but has a discontinuity. (M.T. 1938.)

13. For what values of X is each of the following series uniformly convergent? 

oo (_ l)n 00 1 / 1\ 

(i) (ii) S-^l + £ + ...+-Jsinn*. (M/c, III, 1928.) 

14. Prove that
\[ \log 3 = 1 + \tfrac12 - \tfrac23 + \tfrac14 + \tfrac15 - \tfrac26 + \tfrac17 + \tfrac18 - \tfrac29 + \dots, \]
and obtain a similar series for $\log a$, where $a$ is a positive integer. (Hint: use Abel's test for uniform
convergence.) (M/c, III, 1930.)

15. Show that for any positive value of $x$
\[ \lim_{n \to \infty} \left( \frac{x}{n + x} + \frac{x}{n + 2x} + \dots + \frac{x}{n + nx} \right) = \log(1 + x). \]
(I.C. 1938.)

16. Prove that the binomial series
\[ 1 + \sum_{r=1}^{\infty} \frac{n(n+1)\dots(n+r-1)}{r!}\,x^r \]
converges for $|x| < 1$ and is unbounded for $|x| > 1$. Prove also that (1) if $x = 1$ the series converges
for $n \le 0$, and is unbounded for $n > 0$, (2) if $x = -1$ the series converges for $n < 1$, is unbounded for
$n > 1$, and oscillates for $n = 1$.

17. If for $n > m$, $|u_n|^{1/n} < k < 1$, show that $\sum u_n$ converges. Apply this rule to the series given by
\[ u_{2n} = 2^{-2n}, \qquad u_{2n+1} = 3^{-2n-1}. \]
Will the rule of 1·117 establish the convergence of this series? If not, what extension of the rule
will do so?


18. Show that the series
\[ \sum_{l=1}^{\infty} \sum_{m=1}^{\infty} \frac{l + m}{(l^2 + m^2)^s} \]
converges or not according as $s > \tfrac32$ or $s \le \tfrac32$. (M/c, III, 1932.)


19. For the two series
\[ S = \sum_{n=1}^{\infty} n^{-4}, \qquad T = \sum_{n=1}^{\infty} 2^{-n}, \]
find how large $n$ must be to make the error in stopping at the $n$th term (1) < 0·005, (2) < 0·0000005.
20. Prove that
\[ \sum \frac{1}{n (\log n)^p} \]
converges or tends to infinity according as $p > 1$ or $p \le 1$.

Hence show that if
\[ u_n > 0, \qquad \frac{u_n}{u_{n-1}} \to 1, \qquad n\left(1 - \frac{u_n}{u_{n-1}}\right) \to 1, \qquad \text{and} \qquad \log n \left[ n\left(1 - \frac{u_n}{u_{n-1}}\right) - 1 \right] \to k > 1, \]
the series $\sum u_n$ converges.


21. Discuss the convergence of the series
\[ 1 + \frac{a}{b}x + \frac{a(a+2)}{b(b+1)}x^2 + \frac{a(a+2)(a+4)}{b(b+1)(b+2)}x^3 + \dots \]
for positive values of $a$, $b$, $x$. (I.C. 1944.)


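22. Show that
\[ 1 + \frac{\alpha\beta}{1 \cdot \gamma}x + \dots + \frac{\alpha(\alpha+1)\dots(\alpha+n-1)\,\beta(\beta+1)\dots(\beta+n-1)}{n!\,\gamma(\gamma+1)\dots(\gamma+n-1)}x^n + \dots \]
converges if $0 \le x < 1$, and if $x = 1$ and $\gamma > \alpha + \beta$.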

23. Prove that the product of two Riemann-integrable functions has a Riemann integral. 

24. Prove the result of 1·104 by integration by parts.

1 


25. If 
prove that 

26. If 


S = S--, 

1V+1’ 

£ + £7r<£<£ + £flr. 


N ■ £ l A 

f(x) = S a n ooa\ n x7T, g(x) = S a„cos I X n x + ~\n, 
n«=o n**0 \ x ' 


where the a n and A n are real, prove that if f(x) or g{x) has no zero in the interval —< x < —, where 

m+1 m 

m is a positive integer, then the other function has at least one zero in the interior of this interval. 

(M.T. 1938.) 


27. Prove that 


/>£<£&*--*• J>J 7 S^ =i - (MTm3) 


28. Investigate the convergence of the infinite products n(l + « n ), where 

(-1)" (-l) n 

d)w» = L n5T-. (2)«„ = 


n 1/s 


log(n+l) 


(M/c, III, 1928.) 


29. By considering $f(x) = \sqrt{x}\cos\dfrac{1}{x}$, $g(x) = \sqrt{x}\sin\dfrac{1}{x}$, show that it is not a sufficient condition for
the existence of $\int_{x=0} f(x)\,dg(x)$ that $f(x)$ and $g(x)$ shall both be continuous.

30. If $f(x)$ has derivatives up to the $(n-1)$th for $-a < x < a$, and if $f^{(n)}(0)$ exists, prove that
\[ f(x) = f(0) + \sum_{r=1}^{n} f^{(r)}(0)\,\frac{x^r}{r!} + o(x^n). \]


31. If $f(x)$ has a derivative $f'(x)$ for $a \le x \le b$, and if $f'(a) < p < f'(b)$, then there is a $\xi$ such that
$a < \xi < b$ and $f'(\xi) = p$. ($f'(x)$ is not assumed continuous.)

32. $f_n(x)$ is non-decreasing with respect to $x$ in $(a, b)$ or $(-\infty, \infty)$ and uniformly bounded
with respect to $n$; and $\lim_{n \to \infty} f_n(x) = f(x)$. Prove that convergence is uniform in any interval that
includes no discontinuity of $f(x)$.



Chapter 2 

SCALARS AND VECTORS 

‘The moral of that is, “Take care of the sense and the sounds will take care of themselves”.’ 

LEWIS CARROLL, Alice in Wonderland

2·01. Cartesian coordinates: summation convention. Any physical measurement
is the assignment of a single magnitude. Physics may be defined as the study of the
relations between magnitudes, so that from one set of measurements other sets, given the
conditions of observation, can be predicted. The most elementary measures, except for
simple counting, are those of distance. Now we saw that distances along the same straight
line are found experimentally to be additive in a definable sense, and to satisfy the
associative and commutative laws of addition. But when two distances are not along the
same straight line it is found that they no longer have a unique sum unless another
condition is provided; if P, Q, R are three points, it is not true that the distance PR
depends only on PQ and QR. There is, however, an experimentally verifiable relation
between distances along any two intersecting lines PQQ' and PRR', namely,
\[ \frac{PQ^2 + PR^2 - QR^2}{2\,PQ \cdot PR} = \frac{PQ'^2 + PR'^2 - Q'R'^2}{2\,PQ' \cdot PR'}, \tag{1} \]

when P is not between Q and Q' nor between R and R'. This ratio (a number) is denoted
by $\cos\theta$, and $\theta$ is called the angle between the lines. $\cos\theta$ is never less than $-1$ or greater
than $+1$. Now that measurement has largely replaced Euclid's methods and the experimental
treatment of 'geometry' is advocated, it is desirable that one of the first steps in teaching
should be the direct verification of this law, and that it should be made the basis
of the development of the subject. It is far better verified than some of the usual axioms.
It makes angle a derived magnitude, and the additive property of angles in a plane, taken
as a postulate by Euclid, can be deduced from it. This is all
to the good because a plane is a more difficult idea than a straight line. From (1) the whole
Euclidean theory can be developed up to the introduction of rectangular coordinates.*

In Euclid's methods the notion of superposition plays a prominent part, and he is
always speaking of the actual things compared. The tendency of modern teaching is to
try to avoid superposition. But it is directly related to physical methods, and Euclid's
language makes it impossible to confuse, say, a length with an area. The language of
physical magnitude can say things easily that have physical meaning and would be
difficult or impossible to express in Euclid's. But the attempt to reduce his system to
pure mathematics removes what, for physics, are its outstanding good points.

Rectangular coordinates have the property that any distance between two points can 
be expressed in terms of them symmetrically by a sum of three squares of their differences. 

* H. Jeffreys, Scientific Inference, Chapter 7; other considerations concerning physical magnitudes
will be found in Chapters 4 and 6. It is particularly important to recognize that the
establishment of scientific laws is a matter of successive approximation.
This property is shared by no other way of specifying position. The common statement
that rectangular coordinates have no special physical significance is nonsense. In projective
geometry the notion of distance is rejected because it is metrical. In physics we
cannot do without it. But when it is introduced we can consider what is the shortest
distance between a point and the points on a given plane (defined as the locus of points
equidistant from two given points), and this leads directly to the notions of perpendiculars
and rectangular coordinates.

A rectangular coordinate is a distance from a given plane; we agree to associate it with 
a positive sign for points on one side of the plane and a negative sign for points on the 
other side. Then we can speak of the displacement from P to Q as specified by three 
components, namely, the differences of the rectangular coordinates of the two points, and 
these differences are equal to the projections of PQ on the three axes used. As each is a 
distance along a given line, measured in a given direction, they have the additive property,
and can be taken in any order. That is, starting from P, we can find a point P'
with the same y and z coordinates as P but the x coordinate of Q; then a point P" with 
the same x and z coordinates as P' and the y coordinate of Q; and finally we get to Q by 
varying the z coordinate. But we could adjust the coordinates in six different orders and 
still reach Q in three steps. This would still be true with any kind of coordinates, but with 
rectangular coordinates the three displacements have a special property, that each is in 
a given direction and has a given magnitude. (This is also true with oblique Cartesian 
coordinates, but these are used only in special applications and we shall not treat them 
till Chapter 4.) In a sense, then, we can regard parallel displacements of the same amount 
as equivalent. This is a particular case of the parallelogram law, but the latter is used in 
its general form only for oblique axes and we are not concerned with it at present. We can 
regard the equivalence as representing a physical process by supposing the displacements 
transferred bodily to their new starting points by means of a T-square and set-square. 
The process, however, is not of frequent application. 

The really important property of rectangular coordinates is that they express the
properties of distance equally well, and in the same form, whatever directions we take
for the axes, subject only to their being mutually perpendicular. We do often want to
change axes, and we require a way of inferring the coordinates with regard to one set of
axes, given their values with respect to another. Now the components of a displacement
with regard to the new set will be its projections on the new axes, with due attention to
sign. Now if a line RS makes angles $\alpha$, $\beta$, $\gamma$ with the old axes, and the displacement PQ
has components $u$, $v$, $w$ with respect to them, then the projection of PQ on RS is
\[ u\cos\alpha + v\cos\beta + w\cos\gamma. \tag{2} \]
This notation is cumbersome. It is usual to shorten it by denoting the three cosines by
$l$, $m$, $n$ and to call them the direction cosines of the line RS; then the projection becomes
$lu + mv + nw$. A further shortening is achieved if we denote the axes by $x_1$, $x_2$, $x_3$; the components
will then be denoted by $u_1$, $u_2$, $u_3$ and the direction cosines by $l_1$, $l_2$, $l_3$. Then the
projection is
\[ \sum_i l_i u_i \qquad (i = 1, 2, 3). \tag{3} \]

The advantage of suffix notation is that the most general laws of physics have the same
form for all components. Hence if we have, say, differential equations for the three
coordinates of a particle it is enough to write one equation in suffix notation and let it
be understood that the suffix is to take all the values 1, 2, 3 in turn. A further shortening
of writing is obtained by the summation convention. We see that in (3) each term contains
the same suffix twice, and the results are to be added. We make it a rule that in any expression
in suffix notation containing a repeated suffix, that suffix is to be given all possible
values and the results then added. Thus for (3) we write simply $l_i u_i$ and leave the summation
to be understood from the convention.
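The summation convention amounts to a sum over the repeated suffix. As a minimal sketch (the function name `contract` and the sample line and displacement are assumptions for illustration), the projection $l_i u_i$ of a displacement on a line can be computed as follows.

```python
import math

def contract(l, u):
    """l_i u_i: the repeated suffix i takes the values 1, 2, 3 and the results are added."""
    return sum(l[i] * u[i] for i in range(3))

# A line equally inclined to the three axes, and a displacement (1, 2, 3).
l = [1 / math.sqrt(3)] * 3
u = [1.0, 2.0, 3.0]

projection = contract(l, u)
print(projection)
```

The same function applied to `l` with itself gives $l_i l_i$, the sum of squares of the direction cosines.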

2·02. Transformation. Now if we have two sets of rectangular axes O123, O1'2'3'
with a common origin O, we denote the two sets of coordinates by $x_i$, $x'_j$ respectively. These
must be regarded as two different ways of saying where P is. We denote the direction
cosines of a particular $x'_j$ axis with respect to the $x_i$ axes by $l_{ij}$. Then $x'_j$ is the projection
of OP on the direction of the $x'_j$ axis and therefore is equal to $l_{ij} x_i$. This is true whether
$j = 1$, 2, or 3; hence
\[ x'_j = l_{ij} x_i. \tag{4} \]
This summarizes the three equations of the transformation, each of which has three
terms on the right side.

We have not used the condition that the axes $x'_j$ are mutually perpendicular. The
condition that $x'_1$ and $x'_2$ are perpendicular is
\[ l_{i1} l_{i2} = 0, \tag{5} \]
with two similar relations for the other pairs. Again, since $l_{i1}$ are direction cosines of the
$x'_1$ axis referred to O123
\[ l_{i1} l_{i1} = 1 \tag{6} \]
with two similar relations. (We do not write $l_{ij} l_{ij}$ here because that would imply summation
with regard to both $i$ and $j$, which we do not intend.) Hence, though there are nine $l_{ij}$,
they are connected by six relations of the forms (5) and (6), and we should expect that
only three of them can be assigned independently. This is actually true, but this argument
must not be regarded as a proof.

2·021. $\delta_{ik}$. These six relations can be written as one. We introduce a set of numbers
$\delta_{ik}$, where $i$ and $k$ can each be 1, 2, or 3, such that $\delta_{ik} = 1$ if $i = k$, and $\delta_{ik} = 0$ if $i \ne k$.
Then we have
\[ l_{ij} l_{il} = \delta_{jl}. \tag{7} \]
This set of quantities is called the substitution tensor;* in any expression containing a
suffix $k$, not repeated, if we multiply by $\delta_{ik}$ and add, the only non-zero term is that with
$k = i$, and the result is therefore to replace $k$ by $i$. Note that we must distinguish between
$\delta_{ik}$ with $i = k$, which is 1, and $\delta_{ii}$, which implies a summation and is 3. Note also that in
$l_{il}$ the form of the expression itself indicates that the two $l$'s cannot possibly mean the
same thing. It is unavoidable with suffix notation that the same letter may be required in
two senses, since so many letters already have special meanings, but a suffix simply identifies
an axis and can take the values 1, 2, 3, and the same letter on the line stands for a physical
magnitude. If this is borne in mind no confusion will be possible. The suffixes used here
are $i, k, m, p, \dots$ for the original axes, $j, l, n, q, \dots$ for the transformed ones; $o$ is omitted
because it might be confused with a figure. Consecutive Greek or italic letters are often
used in each system, but this is liable to lead to overrunning the end of the alphabet in
one system or to require additional accents, which make writing more difficult.

* See 3·03. We do not need the definition of tensors in general at present. $\delta_{ik}$ is a particular case
of the 'Kronecker $\delta$'.

2·022. Reverse transformation. Since the $x'_j$ axes are an orthogonal set and the
cosine of the angle between the $x_i$ and $x'_j$ axes is $l_{ij}$, we have also
\[ x_i = l_{ij} x'_j, \tag{8} \]
and since the $x_i$ set are orthogonal
\[ l_{ij} l_{kj} = \delta_{ik}. \tag{9} \]
This set of relations is deduced algebraically from (7) in 2·073. (This in itself is a warning
against complete trust in the method of counting constants. We now have 12 relations
between 9 quantities.)
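Relations (4), (7), (8) and (9) can be checked on a concrete case. The sketch below is an illustration under assumptions not in the text: the new axes are obtained by turning the old ones through an angle $t$ about the third axis, and all function names are hypothetical.

```python
import math

def direction_cosines(t):
    """l[i][j]: cosine of the angle between old axis i and new axis j, for new
    axes obtained by turning the old ones through t about axis 3."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def to_new(l, x):
    """x'_j = l_ij x_i, rule (4)."""
    return [sum(l[i][j] * x[i] for i in range(3)) for j in range(3)]

def to_old(l, xp):
    """x_i = l_ij x'_j, rule (8)."""
    return [sum(l[i][j] * xp[j] for j in range(3)) for i in range(3)]

def delta(i, k):
    """Substitution tensor: 1 if the suffixes agree, 0 otherwise."""
    return 1.0 if i == k else 0.0

l = direction_cosines(0.7)

# Largest departure from (7), l_ij l_il = delta_jl, summing over i.
rel7 = max(abs(sum(l[i][j] * l[i][q] for i in range(3)) - delta(j, q))
           for j in range(3) for q in range(3))
# Largest departure from (9), l_ij l_kj = delta_ik, summing over j.
rel9 = max(abs(sum(l[i][j] * l[k][j] for j in range(3)) - delta(i, k))
           for i in range(3) for k in range(3))

x = [1.0, 2.0, 3.0]
round_trip = to_old(l, to_new(l, x))
print(rel7, rel9, round_trip)
```

Both sets of relations hold to rounding error, and applying (4) followed by (8) returns the original coordinates.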

2·023. Velocity, acceleration, force. Definition of vector. Unit or direction
vectors. Now in dynamics the equations of motion of a particle take the form
\[ m\ddot{x}_i = X_i, \tag{10} \]
when the axes are inertial.* (Inertial is better than the usual word fixed.) But if we take
another inertial set the $l_{ij}$ are not varying with time, and therefore
\[ \dot{x}'_j = l_{ij}\dot{x}_i, \qquad \ddot{x}'_j = l_{ij}\ddot{x}_i. \tag{11} \]
Hence the components of velocity and acceleration with respect to the $x'_j$ axes are related
to the components with regard to the $x_i$ axes as the coordinates are.

Now (10) are supposed to be true for any inertial axes; for different sets the force
components $X_i$, $X'_j$ must have different values, but we must still have
\[ m\ddot{x}'_j = X'_j, \tag{12} \]
however the axes are transformed, and quite irrespective of the actual values of $X_i$.
But this can be true only if
\[ X'_j = l_{ij} X_i. \tag{13} \]
Thus besides displacement we find that on change of axes velocity, acceleration and force
all transform according to the rule (4): force on the supposition that the equations of
motion are stated in a form true for all inertial axes. Thus for a particle moving under
gravity, with O3 upwards, we have $(\ddot{x}_1, \ddot{x}_2, \ddot{x}_3) = (0, 0, -g)$. But this is not true for
other axes and the general form is $\ddot{x}_i = -g l_i$, where $l_i$ are the direction cosines of the
upward vertical.

Any three quantities that transform on rotation of axes according to the rule
\[ A'_j = l_{ij} A_i \tag{14} \]
are said to be the components of a vector with regard to those axes. Now if we start from
any set of three equations that are true for every set of axes and work out from them any
consequence, using a particular set of axes O1, O2, O3, we could equally well have derived
from the equations stated in terms of another set of axes O1', O2', O3' a consequence
differing formally from the first only in the appending of accents to all the letters. If the
consequence consists of a set of three equations, and the left sides are the components of
a vector, we know that the left sides in the two systems are related according to (14).
But since both sets of equations are true the right sides also must transform according
to (14) and be the components of a vector. Conversely if three equations assert the equality
of all components of two vectors with respect to one set of axes, it follows from the fact
that both sides transform according to the same rule that the set can be adapted at once
to any other set of axes by merely inserting accents.

* It is not our purpose to explain in detail here how, and how far, Newtonian dynamics is based
on experiment. A discussion will be found in Scientific Inference, Chapter 8.

If we take any two lines whose direction cosines with regard to the axes are $m_i$, $n_i$,
the cosine of the angle between them is $m_i n_i$. If in particular the first line is the axis of
$x'_j$, the cosine of the inclination to it of a line with direction cosines $n_i$ is
\[ n'_j = l_{ij} n_i, \tag{15} \]
so that the direction cosines of a given line transform according to the rule (14) and are
the components of a vector. Such a vector is often called a unit vector: we prefer direction
vector, since the only application is to specify a direction.

We can speak of the component of a vector in any direction. If a line has direction
cosines $n_i$ with respect to a set of axes, the component of the vector $A_i$ in that direction
is $n_i A_i$. Now suppose that we used the $x'_j$ system and tried to find the component of the
same vector in the same direction. It would be
\[ \begin{aligned} n'_j A'_j &= l_{ij} n_i\, l_{kj} A_k = (l_{ij} l_{kj})\, n_i A_k \\ &= \delta_{ik} n_i A_k = n_i A_i, \end{aligned} \tag{16} \]
and therefore the component of a vector in any given direction is independent of the
axes used.
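Result (16) can be confirmed numerically. In the following sketch the choice of new axes (the old ones turned through an angle about the first axis), the sample vector and direction, and the function names are all assumptions made for illustration.

```python
import math

def direction_cosines(t):
    """l[i][j] for new axes obtained by turning the old ones through t about axis 1."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0],
            [0.0, c, -s],
            [0.0, s, c]]

def transform(l, A):
    """A'_j = l_ij A_i, rule (14)."""
    return [sum(l[i][j] * A[i] for i in range(3)) for j in range(3)]

def component(n, A):
    """n_i A_i: component of A in the direction with direction cosines n."""
    return sum(n[i] * A[i] for i in range(3))

l = direction_cosines(1.1)
A = [3.0, -1.0, 4.0]
n = [2 / 3, 1 / 3, 2 / 3]      # direction cosines: squares sum to 1

before = component(n, A)
after = component(transform(l, n), transform(l, A))
print(before, after)
```

Both the vector and the direction vector are transformed by the same rule, and the component is unchanged to rounding error.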

This result could also be derived by taking a third set of axes, one of which is in the
direction of the line $n_i$, and carrying out the transformation of axes first directly and then
by way of the $x'_j$ set.

A vector is often defined as an entity requiring three components for its specification, 
and with an additive property expressed by the parallelogram law. The latter part of the 
definition, however, supposes for its application that the vectors considered are to be 
represented by displacements with an arbitrary scale factor, usually dimensional; until
this is done we do not know what we mean by the parallelogram law for that kind of 
vector. The introduction of the parallelogram is really an unnecessary complication, and 
it is better to proceed directly to the analytical statement of the required property. 
Again, the rule requires that we should know what we mean by addition for that kind of 
vector. We have natural interpretations for displacement, velocity and acceleration, and 
force. But we shall meet more complicated vectors such that it becomes difficult to find 
anything but an analytical definition of addition; and it is quite unnecessary that we 
should find one, since all that is required for our purposes is that our equations shall be 
true; if we can calculate a quantity correctly, it is not necessary for physics that we 
should find a separate physical interpretation for every term contained in it. Accordingly,
we shall define a vector by the transformation property of its components
$A_i$, which includes the statement that the component in a direction with direction
cosines $n_i$ is $n_i A_i$. A scalar is a single quantity, the same for all axes.
2·03. Single-letter notation. A vector with components $A_i$ can be still more shortly
denoted by A in heavy type. It is not really necessary, and is in fact rather difficult, to
define what we mean by a vector apart from its components. In making comparisons with
observation it is the components that we are concerned with, but it is often convenient
to work with the single symbol. We can then define the sum of two vectors A and B to
be the vector whose components are $A_i + B_i$ and denote it by A+B. -B is the vector
whose components are $-B_i$; and A-B = A + (-B). We can define the product of a
vector A by any scalar $m$ to be the vector whose components are $mA_i$. It is obvious that
with this definition vectors satisfy the commutative and associative rules of addition;
B+A is the vector whose components are $B_i + A_i = A_i + B_i$ and these are the components
of A+B. Similarly, the associative rule

A + (B+C) = (A+B) + C 

follows at once from the definition. We cannot significantly add a vector to a scalar, 
since the former is altered on change of axes and the scalar is not; but clearly mA and Am 
represent the same vector, and the commutative law of multiplication is satisfied. 
Similarly if m and n are scalars 


(m + n)A = mA+nA = (n + m)A, 

provided m and n have the same dimensions; otherwise addition is meaningless. Also 

m(nA) = (mn)A. 

The displacement from P to Q, considered as a vector, will be denoted by PQ. 

We have seen that a vector is completely specified by three components; but it is easy to see that 
this condition can be satisfied without the commutative law of addition being satisfied and therefore 
is not by itself a sufficient condition for the vector property. 

Consider the rotation of a rigid body through a finite angle 
about any axis through a fixed point O of the body. The 
natural ‘sum’ of two successive rotations is a single rotation, 
which would bring the body into the same final configuration. 

A single rotation is completely specified by the direction of 
the axis of rotation, which requires two data to specify it, and 
the angle turned through. Let us take a set of rectangular axes 
O123, fixed in space, and consider two successive rotations,
first through $\tfrac{1}{2}\pi$ about O1, then through $\tfrac{1}{2}\pi$ about O2, both
being right-handed rotations. If we take them in this order a
point P with coordinates (0, 0, 1) goes first to P′(0, -1, 0) and
remains there at the second rotation. But if we make the
rotation about O2 first, the point goes to P″(1, 0, 0) and
is undisplaced by the second rotation. The order of the 
rotations affects the result. Now if the sum of the two rotations was obtained by the vector law this 
could not be so, since the commutative law of addition holds for the latter. The representation of 
finite rotations will be considered more fully in the next chapter. 
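
The failure of the commutative law for finite rotations can be checked numerically. What follows is a minimal sketch in Python, not part of the original argument; the function names `rotate_about_1`, `rotate_about_2` and `rounded` are illustrative assumptions, and the rounding is there only to suppress floating-point residue in the cosines of a right angle.

```python
import math

def rotate_about_1(p, angle):
    """Right-handed rotation of the point p about the axis O1."""
    x1, x2, x3 = p
    c, s = math.cos(angle), math.sin(angle)
    return (x1, c * x2 - s * x3, s * x2 + c * x3)

def rotate_about_2(p, angle):
    """Right-handed rotation of the point p about the axis O2."""
    x1, x2, x3 = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x1 + s * x3, x2, -s * x1 + c * x3)

def rounded(p):
    """Round each coordinate to the nearest integer."""
    return tuple(round(v) for v in p)

half_pi = math.pi / 2
P = (0.0, 0.0, 1.0)

# Rotation about O1 first, then about O2.
order_12 = rounded(rotate_about_2(rotate_about_1(P, half_pi), half_pi))
# Rotation about O2 first, then about O1.
order_21 = rounded(rotate_about_1(rotate_about_2(P, half_pi), half_pi))

print(order_12, order_21)
```

The two final positions differ, so the 'sum' of two finite rotations depends on their order and cannot be given by the vector law of addition.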

2·031. The geometrical representation of a vector can be shown to be possible from our
definition. For if A is any vector we can multiply its components $A_i$ by a constant c
chosen so as to make them lengths. Then if $x_i = cA_i$, the projection of x on the direction
$l_i$ is

$$l_i x_i = c\, l_i A_i,$$

and dividing by c we recover the component of A in the direction $l_i$. The addition of
vectors of the same dimensions is then seen to be completely expressed by representing
them by displacements on the same scale, and the parallelogram law is thus recovered
for the general vector. Further properties of vectors can be inferred from those of
displacements by this representation. In particular, any vector has a modulus and a
direction; for if r is the length of the displacement representing it on a given scale, we have

$$r^2 = x_1^2 + x_2^2 + x_3^2 = c^2 (A_1^2 + A_2^2 + A_3^2) = c^2 A^2, \qquad (1)$$

where

$$A^2 = A_i A_i, \qquad (2)$$

and is independent of the choice of axes. Then A (taken positive) can be called the modulus
of the vector $\mathbf{A}$, and corresponds to the distance in a displacement. Also if $A \neq 0$, and if we write

$$x_i / r = A_i / A = m_i,$$

the $m_i$ are the direction cosines of a definite line, which can be called the direction of $\mathbf{A}$.
It follows further that the projection on a line in the direction $l_i$ of the representative
displacement is $cA\, l_i m_i = cA \cos\theta$, where $\theta$ is the angle between the directions of $l_i$ and
$m_i$; and the component of $\mathbf{A}$ in the direction $l_i$ is $A \cos\theta$. But the component in any given
direction is independent of the axes of reference and therefore A and $\theta$, for all $l_i$, are
independent of the axes. Hence A is the same magnitude and $m_i$ the same direction,
whatever the axes.


It is convenient in one case to depart from the rule that we take A positive. A straight line can be 
represented by the equations 

$$x_i = a_i + s l_i.$$

If we keep $l_i$ the same, we can get all points on the line by allowing s to range from $-\infty$ to $\infty$, and this
corresponds to proceeding along the line in a definite direction. If $l_i$ is assigned it is therefore convenient
to take for a line through the origin

$$x_i = x l_i,$$

thus admitting negative values of the quantity x. The distance from the origin, taken positive, will
always be denoted by r. When we take

$$x_i = r l_i$$

with r positive we are considering two lines in opposite directions from the origin as different lines, with
equal and opposite values of $l_i$. This is sometimes convenient, but not always.

2*032. Comparison of notations. The importance and use of vector notation is a 
matter of debate among mathematical physicists. Anything that can be said by means 
of A can be said by means of $A_i$ or by writing out the components in full. If, however, the
geometrical representation of a vector by a directed line segment is constantly borne in 
mind, to some minds the content of many physical laws is most clearly understood in 
vector notation. A little trouble is needed to learn to ‘think in vectors’. A little is also 
required to acquire confidence that the compactness achieved by the summation convention
does not lead to mistakes. It must, moreover, be remembered that if a physical
result about a vector is obtained, it will always take three measurements to verify it, 
and the three components will have to be unpacked, whether it is expressed in vector or 
suffix notation. The unpacking from suffix notation is often easier and never harder than 
from vector notation. Some general theorems are more compactly expressed in the one, 
some in the other. In special problems judgment is needed to decide on the best moment 
for unpacking, and many students defer it too long when the conditions of the problem 
indicate one or two special directions. In elasticity and the dynamics of viscous fluids 
the suffix notation adapts itself far more naturally, and in the theory of relativity vector
notation breaks down completely because the parallelogram law fails for velocities. Consequently
some mathematical physicists hold that vector notation is pure waste of time
and delays the acquisition of familiarity with the more generally useful method. What 
we shall do in the present chapter is to show the two methods, as far as possible, side by 
side. Ability to translate from either language to the other, or to the expanded form, is 
an absolute necessity in understanding modern physical literature. It is often useful to 
visualize a vector as a displacement vector, and while as a matter of definition we make
a clear distinction between a general vector and a displacement vector, we shall frequently 
speak of a general vector in geometrical terms: e.g. the angle between two vectors A and 
B means strictly ‘the angle between the displacement vectors representing A and B 
according to some specified scale’; ‘two perpendicular vectors A and B’ means ‘two
vectors A and B such that the displacements representing them are perpendicular’. 
The use of this analogy is unnecessary in suffix notation, analytical definitions being 
provided. 

2·033. Null vector. A null or zero vector is one whose modulus is zero.

2·034. Direction vectors. A vector of modulus 1 (a number) in the direction of a
vector A is called a unit or direction vector in that direction. Its components are evidently
$l_i$, the direction cosines of the direction of A with regard to the coordinate axes. In
particular we shall denote direction vectors in the directions of the axes by
$\mathbf{e}_{(1)}$, $\mathbf{e}_{(2)}$, $\mathbf{e}_{(3)}$ respectively; that is,

$$\mathbf{e}_{(1)} = (1, 0, 0), \quad \mathbf{e}_{(2)} = (0, 1, 0), \quad \mathbf{e}_{(3)} = (0, 0, 1).$$

The use of the brackets round the suffix is to emphasize that it does not denote a component,
but a particular vector. Any vector A may be written as

$$A_1 \mathbf{e}_{(1)} + A_2 \mathbf{e}_{(2)} + A_3 \mathbf{e}_{(3)}.$$

Some books denote direction vectors parallel to the axes by i, j, k and write

$$\mathbf{A} = A_x \mathbf{i} + A_y \mathbf{j} + A_z \mathbf{k}.$$

2·04. Linearly dependent or coplanar vectors. If there is a relation

$$\alpha \mathbf{A} + \beta \mathbf{B} + \gamma \mathbf{C} = 0 \qquad (1)$$

between three vectors, where $\alpha$, $\beta$, $\gamma$ are real numbers (not all zero), then A, B, C are said
to be linearly dependent. Geometrically this means that A, B, C can be represented by
displacement vectors lying in a plane, since if $\gamma \neq 0$,

$$\mathbf{C} = -\frac{1}{\gamma} (\alpha \mathbf{A} + \beta \mathbf{B}), \qquad (2)$$

and therefore C is represented by a displacement vector lying in the plane of the
displacements representing A and B, supposing these drawn through the same point. The
vectors themselves are then said to be coplanar. We may recall at this point that parallel
vectors of equal magnitude are equivalent in the system we are using. Some writers 
distinguish between ‘free vectors’ and ‘localized vectors’, a localized vector including, 
for instance, the specification both of a force and of its point of application. In our sense 
a localized vector is not a vector but two vectors, one to specify the force and the other to 
specify the point of application.

If there is no such relation as (1) the three vectors are said to be linearly independent
or non-coplanar. 

The corresponding results in suffix notation can be written down at once. 

2·041. Expression of any vector in terms of three non-coplanar vectors. If
A, B, C are any three non-coplanar vectors and D is any vector, then D can be expressed
as $\alpha \mathbf{A} + \beta \mathbf{B} + \gamma \mathbf{C}$, where $\alpha$, $\beta$, $\gamma$ are real quantities. For, let PQ represent D. Then lines
through P representing A and B define a plane. Let RQ be a line through Q in the direction
of C and meeting the plane defined by A and B in R. Then

PQ = PR + RQ.

But RQ represents $\gamma \mathbf{C}$, where $\gamma$ is a scalar; and PR
is a vector coplanar with A and B, and can therefore
be expressed as $\alpha \mathbf{A} + \beta \mathbf{B}$. Hence

$$\mathbf{D} = \alpha \mathbf{A} + \beta \mathbf{B} + \gamma \mathbf{C}.$$

Since there is always a relation of this type between
four vectors it follows that four vectors cannot be
linearly independent. From the construction given it
is clear that for given A, B, C (non-coplanar) and any D, the quantities $\alpha$, $\beta$, $\gamma$ are
uniquely determined.

When A, B, C are direction vectors, mutually perpendicular to one another, we have
the particular case given by 2·034. If A, B, C are not mutually perpendicular, $\alpha \mathbf{A}$, $\beta \mathbf{B}$, $\gamma \mathbf{C}$
are the oblique components of D in the directions of A, B, C respectively.
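
The coefficients of the oblique components can also be found by calculation. The sketch below is an illustration in Python and is not the geometrical construction given above: it solves the three linear equations for the coefficients by Cramer's rule, which is a substituted technique, and the names `det3` and `oblique_components` are assumptions made for the example.

```python
def det3(a, b, c):
    """Determinant whose rows are the component triples a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def oblique_components(A, B, C, D):
    """Return (alpha, beta, gamma) such that D = alpha*A + beta*B + gamma*C."""
    d = det3(A, B, C)
    if d == 0:
        raise ValueError("A, B, C are coplanar; no unique expression exists")
    # Cramer's rule: replace each of A, B, C in turn by D.
    return (det3(D, B, C) / d, det3(A, D, C) / d, det3(A, B, D) / d)

A = (1, 0, 0)
B = (1, 1, 0)
C = (1, 1, 1)
D = (2, 3, 4)

alpha, beta, gamma = oblique_components(A, B, C, D)

# Rebuild D from the three oblique components as a check.
rebuilt = tuple(alpha * a + beta * b + gamma * c for a, b, c in zip(A, B, C))
print(alpha, beta, gamma, rebuilt)
```

The exception raised for a vanishing determinant corresponds to the case in which A, B, C are coplanar, when the construction fails.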

2·05. Multiplication of vectors. We have considered multiplication of a vector
by a scalar, but the meaning, if any, to be assigned to multiplication of two vectors is
not immediately obvious. We can set out the nine products of the components in a
square array, thus:

$$\begin{matrix} A_1 B_1 & A_1 B_2 & A_1 B_3 \\ A_2 B_1 & A_2 B_2 & A_2 B_3 \\ A_3 B_1 & A_3 B_2 & A_3 B_3 \end{matrix}$$

and we shall find that these products all reappear in Chapter 3. The two products called
the scalar and vector products, which we now proceed to define, are particular combinations
of these nine products, and their choice is dictated by their usefulness in physical
applications.

2·06. Scalar product. This function is directly related to the fundamental and
experimentally verifiable relation 2·01 (1) between distances not measured along the
same line. We have

$$PQ \cdot PR \cos\theta = \tfrac{1}{2}(PQ^2 + PR^2 - QR^2), \qquad (1)$$

and this is completely determined by the three distances PQ, PR, QR. Since distance is
the fundamental notion of the whole subject and is the same for every frame of reference,
this expression is a single quantity whose value is independent of the coordinate system:
we have called such quantities scalars. Now if we translate into Cartesian coordinates,
if $x_i$ denotes PQ, and $y_i$ denotes PR, then $y_i - x_i$ denotes QR, and

$$\tfrac{1}{2}(PQ^2 + PR^2 - QR^2) = \tfrac{1}{2}\left[(x_1^2 + x_2^2 + x_3^2) + (y_1^2 + y_2^2 + y_3^2) - \{(y_1 - x_1)^2 + (y_2 - x_2)^2 + (y_3 - x_3)^2\}\right] \qquad (2)$$

$$= x_1 y_1 + x_2 y_2 + x_3 y_3. \qquad (3)$$

This is called the scalar product of the vectors x and y, and in general the scalar product 
of A and B is defined by 

$$\mathbf{A} \cdot \mathbf{B} = A_1 B_1 + A_2 B_2 + A_3 B_3 = \sum_{i=1,2,3} A_i B_i = A_i B_i, \qquad (4)$$

using the summation convention. A.B is equal to $AB \cos\theta$, where $\theta$ is the angle between
the directions of the two vectors. We read it as ‘A dot B’. The two expressions in
coordinates in (2), (3) would be written by the summation convention as

$$\tfrac{1}{2}\{x_i x_i + y_i y_i - (y_i - x_i)(y_i - x_i)\} = x_i y_i, \qquad (5)$$

and the left side can be further shortened to

$$\tfrac{1}{2}\{x_i^2 + y_i^2 - (y_i - x_i)^2\}. \qquad (6)$$

We recall that $x^2$ would formerly have been written xx, which is what we mean by $x^2$;
so when we see an expression like $x_i^2$ we interpret it as $x_i x_i$ and apply the summation
convention. In terms of this convention the modulus A of a vector A is given by

$$A^2 = A_i^2.$$

This expression is the scalar product of A with itself. The saving of writing by the summation
convention is enormous, so great that on the rare occasions when we do not use
it we say so specially. Without it, suffix notation would have little advantage over writing 
everything out fully in Cartesian coordinates; with it, expressions that when fully 
developed would contain 9 or 81 terms can be written down and handled as easily as one. 
The convention remains useful in the theory of relativity and in general dynamics, 
since it is simply a linguistic device and does not depend on the parallelogram law. 

The proof that $A_i B_i$ is independent of the axes of reference, without the use of the
displacement representation, is

$$A'_j B'_j = l_{ij} l_{kj} A_i B_k = \delta_{ik} A_i B_k = A_i B_i, \qquad (7)$$

just as in deriving 2·023 (16).

Commutative law. It is clear from the definition that the order of A and B in the scalar 
product is irrelevant: A.B = B.A. 

Associative law. A. B is not a vector, and we cannot go on to form a product A.B.C, 
so the associative law has no meaning. (A.B)C means a vector in the direction of C and 
of A.B times its magnitude. 

Distributive law. We can, however, prove that 

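$$\mathbf{A} \cdot (\mathbf{B} + \mathbf{C}) = \mathbf{A} \cdot \mathbf{B} + \mathbf{A} \cdot \mathbf{C},$$

for this follows at once from the definition. This can be immediately extended so that the
scalar product of two sums of vectors can be expanded into a sum of scalar products.
In particular, since

$$\mathbf{e}_{(2)} \cdot \mathbf{e}_{(3)} = \mathbf{e}_{(3)} \cdot \mathbf{e}_{(1)} = \mathbf{e}_{(1)} \cdot \mathbf{e}_{(2)} = 0,$$

$$\mathbf{e}_{(1)} \cdot \mathbf{e}_{(1)} = \mathbf{e}_{(2)} \cdot \mathbf{e}_{(2)} = \mathbf{e}_{(3)} \cdot \mathbf{e}_{(3)} = 1,$$

$$\mathbf{A} \cdot \mathbf{B} = (A_1 \mathbf{e}_{(1)} + A_2 \mathbf{e}_{(2)} + A_3 \mathbf{e}_{(3)}) \cdot (B_1 \mathbf{e}_{(1)} + B_2 \mathbf{e}_{(2)} + B_3 \mathbf{e}_{(3)}) = A_1 B_1 + A_2 B_2 + A_3 B_3.$$

We thus recover the definition (4).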

Notice that when direction vectors along the axes are introduced the rectangular 
coordinates are regarded as scalars. 

It should be noticed that if the scalar product of two displacements is zero it implies
$PQ^2 + PR^2 = QR^2$, that is, either one displacement is zero or they are perpendicular.
The vanishing of A.B does not imply that either A or B is a null vector; it implies that
either they are perpendicular or one of them is a null vector. But if
$\mathbf{A} \cdot \mathbf{B}_1$, $\mathbf{A} \cdot \mathbf{B}_2$, $\mathbf{A} \cdot \mathbf{B}_3$
are all zero, where $\mathbf{B}_1$, $\mathbf{B}_2$, $\mathbf{B}_3$ are not coplanar, A cannot be perpendicular to all and must
be null.

The cosine of the angle between two lines is the scalar product of two direction vectors
along the lines; that is, if $l_i$, $m_i$ are the direction cosines of the lines

$$\cos\theta = l_i m_i.$$

The component of a vector A along a line with direction cosines $l_i$ is $l_i A_i$, which is the
scalar product of A and a direction vector along the line.
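
A small numerical illustration may help to fix these relations. The following Python sketch is an addition for the reader, with function names (`dot`, `modulus`, `cos_angle`, `component_along`) and sample displacements chosen purely for the example; it compares the sum of products of components with half the combination of squared distances, and then forms the cosine of the angle and a component along an axis.

```python
import math

def dot(a, b):
    """Scalar product: the sum of products of corresponding components."""
    return sum(ai * bi for ai, bi in zip(a, b))

def modulus(a):
    """Modulus of a vector: the square root of its scalar product with itself."""
    return math.sqrt(dot(a, a))

def cos_angle(a, b):
    """Cosine of the angle between the directions of a and b."""
    return dot(a, b) / (modulus(a) * modulus(b))

def component_along(a, l):
    """Component of a along a line whose direction cosines are l."""
    return dot(l, a)

x = (3.0, 0.0, 4.0)   # displacement PQ
y = (1.0, 2.0, 2.0)   # displacement PR
qr = tuple(yi - xi for xi, yi in zip(x, y))   # displacement QR

# Half of (PQ^2 + PR^2 - QR^2), which should agree with the scalar product.
half_sum = 0.5 * (dot(x, x) + dot(y, y) - dot(qr, qr))

print(half_sum, dot(x, y), cos_angle(x, y), component_along(x, (0.0, 0.0, 1.0)))
```

The first two printed values agree, as the relation between the three distances requires.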

2·07. Vector product. The vector product of two vectors is written $\mathbf{A} \wedge \mathbf{B}$ (read,
A cross B) and is defined as follows by the displacement model. If OP and OQ, representing
A and B respectively, are not parallel they define a plane. Let OR represent a direction
vector n perpendicular to this plane. Then if $\theta$ is the angle turned through from OP to
OQ, right-handedly about OR,

$$\mathbf{A} \wedge \mathbf{B} = AB \sin\theta\, \mathbf{n}.$$

It will be seen that the definition is independent of the choice of the direction of n. If
we took the opposite direction as n the angle turned through right-handedly about it
would be $2\pi - \theta$, and since $\sin(2\pi - \theta) = -\sin\theta$ the result for the magnitude and direction
of the vector product would be the same.

In the vector product the order of the factors matters. For 

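$$\mathbf{B} \wedge \mathbf{A} = BA \sin(-\theta)\, \mathbf{n} = -\mathbf{A} \wedge \mathbf{B}. \qquad (1)$$

The commutative law is not satisfied by the vector product. We shall soon see that the
associative law is also not satisfied, but we can prove that the distributive law holds,
namely,

$$\mathbf{A} \wedge (\mathbf{B} + \mathbf{C}) = \mathbf{A} \wedge \mathbf{B} + \mathbf{A} \wedge \mathbf{C}. \qquad (2)$$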

We consider first two particular cases. 

(1) A is perpendicular to B and C. We notice that if A and B are perpendicular, then
$\mathbf{A} \wedge \mathbf{B}$ is obtained from B by first multiplying B by A and then rotating it about A through
a right angle. Hence the vectors $\mathbf{A} \wedge (\mathbf{B} + \mathbf{C})$, $\mathbf{A} \wedge \mathbf{B}$, $\mathbf{A} \wedge \mathbf{C}$ are represented by line segments
of lengths equal to A times the sides of the triangle representing B + C, B, C respectively,
and each is turned through a right angle. It follows that the vectors $\mathbf{A} \wedge (\mathbf{B} + \mathbf{C})$, $\mathbf{A} \wedge \mathbf{B}$,
$\mathbf{A} \wedge \mathbf{C}$ are represented by the sides of a triangle similar to that representing B + C, B, C,
and hence equation (2) follows for this case.

(2) A, B, C are coplanar. Then $\mathbf{A} \wedge (\mathbf{B} + \mathbf{C})$, $\mathbf{A} \wedge \mathbf{B}$, $\mathbf{A} \wedge \mathbf{C}$ are all in a direction
perpendicular to the plane and the result follows from the addition formula for sines.

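In the general case we assume A and B not coincident or perpendicular and write B
as $\mathbf{B}_p + \mathbf{B}_n$, where $\mathbf{B}_p$ is parallel to A and $\mathbf{B}_n$ perpendicular to A in the plane of A and B.
Then since $B_n = B \sin\theta$, it follows that

$$\mathbf{A} \wedge \mathbf{B} = \mathbf{A} \wedge \mathbf{B}_n.$$

If, similarly, $\mathbf{C} = \mathbf{C}_p + \mathbf{C}_n$, then $\mathbf{A} \wedge \mathbf{C} = \mathbf{A} \wedge \mathbf{C}_n$. But

$$\mathbf{B} + \mathbf{C} = (\mathbf{B}_p + \mathbf{C}_p) + (\mathbf{B}_n + \mathbf{C}_n),$$

and since $\mathbf{B}_p$, $\mathbf{C}_p$ are parallel to A and $\mathbf{B}_n$, $\mathbf{C}_n$ perpendicular
to it, it follows that $\mathbf{B}_n + \mathbf{C}_n$ is the component of B + C
perpendicular to A, and therefore

$$\mathbf{A} \wedge (\mathbf{B} + \mathbf{C}) = \mathbf{A} \wedge (\mathbf{B}_n + \mathbf{C}_n). \qquad (3)$$

But by Case 1 this is equal to

$$\mathbf{A} \wedge \mathbf{B}_n + \mathbf{A} \wedge \mathbf{C}_n = \mathbf{A} \wedge \mathbf{B} + \mathbf{A} \wedge \mathbf{C}. \qquad (4)$$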

The vector product of a vector with itself or with any parallel vector is a null vector. 
If the vector product of any two vectors vanishes, it can be inferred that either they are 
parallel or one of them is a null vector. Again, it does not follow from the fact that the 
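vector product is the null vector that one of the factors is null. But if $\mathbf{A} \wedge \mathbf{B}_1$ and $\mathbf{A} \wedge \mathbf{B}_2$
are null, and $\mathbf{B}_1$, $\mathbf{B}_2$ are not parallel, A must be null.

In particular, for the vectors $\mathbf{e}_{(1)}$, $\mathbf{e}_{(2)}$, $\mathbf{e}_{(3)}$ we have

$$\mathbf{e}_{(1)} \wedge \mathbf{e}_{(1)} = \mathbf{e}_{(2)} \wedge \mathbf{e}_{(2)} = \mathbf{e}_{(3)} \wedge \mathbf{e}_{(3)} = 0, \qquad (5)$$

$$\mathbf{e}_{(2)} \wedge \mathbf{e}_{(3)} = \mathbf{e}_{(1)} = -\mathbf{e}_{(3)} \wedge \mathbf{e}_{(2)}, \qquad (6)$$

so that

$$\mathbf{A} \wedge \mathbf{B} = (A_1 \mathbf{e}_{(1)} + A_2 \mathbf{e}_{(2)} + A_3 \mathbf{e}_{(3)}) \wedge (B_1 \mathbf{e}_{(1)} + B_2 \mathbf{e}_{(2)} + B_3 \mathbf{e}_{(3)})$$

$$= (A_2 B_3 - A_3 B_2) \mathbf{e}_{(1)} + (A_3 B_1 - A_1 B_3) \mathbf{e}_{(2)} + (A_1 B_2 - A_2 B_1) \mathbf{e}_{(3)}$$

$$= \begin{vmatrix} \mathbf{e}_{(1)} & \mathbf{e}_{(2)} & \mathbf{e}_{(3)} \\ A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \end{vmatrix}. \qquad (7)$$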

A geometrical interpretation is available for the vector product of two displacements.
If P is $x_i$ and Q is $y_i$, the projections on the plane $x_1 = 0$ are $P_1(0, x_2, x_3)$, $Q_1(0, y_2, y_3)$,
and the area of the triangle made by these two points and the origin is $\tfrac{1}{2}(x_2 y_3 - x_3 y_2)$.
If we rotate $OP_1$ positively about $Ox_1$ the area is positive if $OQ_1$ is reached after a rotation
less than $\pi$. It will be found that taking double the three projections in turn we have

$$x_2 y_3 - x_3 y_2, \quad x_3 y_1 - x_1 y_3, \quad x_1 y_2 - x_2 y_1. \qquad (8)$$






But by a theorem of geometry these are equal to twice the magnitude of the area of
the triangle OPQ multiplied by the three direction cosines of the normal to its plane,
taken on the side such that the rotation of OP to OQ is in the positive sense about
the normal and less than $\pi$. They are therefore the components of the vector
product $\mathbf{x} \wedge \mathbf{y}$.

Vector area. In the definition of the vector product $\mathbf{x} \wedge \mathbf{y}$ the sense taken for n is irrelevant,
but the right-handed rotation about n needed to bring x into the direction of y may have
any value less than $2\pi$. The statements in the last paragraph would remain true if the
rotations were all taken about lines in the opposite directions to those stated and a
component taken positive if the rotation about the negative direction of the corresponding
axis is between $\pi$ and $2\pi$. But the signs of all components are reversed if x and y are
interchanged. There are advantages in being able to speak of the triangle OPQ as having the
same directed area irrespective of the labels attached to its sides; and this can be done by
defining its vector area as

$$\tfrac{1}{2} |xy \sin\theta|\, \mathbf{n} = \tfrac{1}{2} |\mathbf{x} \wedge \mathbf{y}|\, \mathbf{n},$$

with a particular choice of the sense of n. It will be equal to the vector product if $\sin\theta$
is positive, that is, if the rotation from OP to OQ in the positive sense about n is less
than $\pi$; its sign will be reversed if n is reversed.

To make the vector area unambiguous we need a criterion for identifying the sense of n.
This arises most simply in relation to a surface made up of triangles. By addition we can
define a vector area for the whole surface, n being defined so that it does not cut through
the surface when we pass from one face to an adjacent one. In particular, for a closed
polyhedron, we can take n to be always outwards. In this case the faces with positive $n_1$
will have vector areas whose components are the areas of the projections of the faces on
the plane O23 and make up the area contained in the rim of the projection. Those with
negative $n_1$ give components whose total is the same area with the sign changed. Hence
the vector area of a closed polyhedron is zero.
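
The result for a closed polyhedron can be illustrated on the simplest case, a tetrahedron. The sketch below is an added example in Python; the vertices, the helper names `sub`, `cross` and `vector_area`, and the ordering of the corners of each face (chosen here so that the normal points outwards) are all assumptions of the illustration. Each face contributes half the vector product of two of its sides, which is how the face normal and area are obtained in the code.

```python
def sub(a, b):
    """Displacement from b to a, component by component."""
    return tuple(ai - bi for ai, bi in zip(a, b))

def cross(a, b):
    """Components of the vector product of a and b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def vector_area(p, q, r):
    """Vector area of the triangle pqr: half the vector product of two sides."""
    return tuple(0.5 * c for c in cross(sub(q, p), sub(r, p)))

O = (0, 0, 0)
P = (1, 0, 0)
Q = (0, 1, 0)
R = (0, 0, 1)

# Corners of each face are ordered so that the normal points outwards.
faces = [(O, Q, P), (O, P, R), (O, R, Q), (P, Q, R)]
areas = [vector_area(*face) for face in faces]

# Add the vector areas of the four faces component by component.
total = tuple(sum(area[i] for area in areas) for i in range(3))
print(areas, total)
```

The three faces lying in the coordinate planes give components of -1/2 each, which are exactly balanced by the components of the oblique face.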

2·071. Analytic treatment of vector product: $\epsilon_{ikm}$. In the analytic treatment
we define the vector product directly as a vector with components

$$(A_2 B_3 - A_3 B_2,\; A_3 B_1 - A_1 B_3,\; A_1 B_2 - A_2 B_1).$$

It then requires proof that this set of quantities has the proper transformation properties
and that its components are equal to $AB \sin\theta\, n_i$. The reason for introducing the vector
product at all is that this set of quantities arises naturally in the discussion of the
equations of dynamics, especially the motion of a rigid body and the motion of a charge under
magnetic force, and in electromagnetic theory.

We consider the set of 27 numbers $\epsilon_{ikm}$ specified by the rules (1) if any two of the i, k, m
are equal, $\epsilon_{ikm} = 0$; (2) if they are all different and occur in succession in the order 12312 ...
which we call even, $\epsilon_{ikm} = 1$; (3) if they are all different and occur in the order 21321 ...
which we call odd,* $\epsilon_{ikm} = -1$. That is,

$$\epsilon_{123} = \epsilon_{231} = \epsilon_{312} = 1, \qquad (1)$$

$$\epsilon_{213} = \epsilon_{132} = \epsilon_{321} = -1, \qquad (2)$$

* The reason for the terms even and odd is derived from the number of interchanges of suffixes
needed to produce the order 123. Thus 231 can, by interchanging two suffixes once, be turned into
132 and by a second interchange to 123. But 213 is turned into 123 by a single interchange.

while any of $\epsilon_{111}$, $\epsilon_{112}$, $\epsilon_{232}$ and so on are zero. Now consider the sum

$$\epsilon_{ikm} A_k B_m. \qquad (3)$$

Here k and m are both repeated suffixes. For each i it is therefore the sum of nine terms.
But if either k = i, m = i, or k = m, $\epsilon_{ikm} = 0$. The only terms that can differ from 0 are
therefore the two that have k and m different from i and from each other. Thus if, for
instance, i = 1, we must have k = 2 and m = 3 and therefore $\epsilon_{ikm} = 1$, or k = 3, m = 2,
and therefore $\epsilon_{ikm} = -1$. Hence

$$\epsilon_{1km} A_k B_m = A_2 B_3 - A_3 B_2, \qquad (4)$$

and similarly for i = 2 and 3 we find the other two components of the vector product.
Thus (3) gives a compact expression for the components of the vector product. We shall
denote them at present by $(\mathbf{A} \wedge \mathbf{B})_i$ to facilitate comparison with results already obtained
in vector notation.
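
The double summation over the repeated suffixes can be written out mechanically. The following Python sketch is an added illustration; the closed formula used in `eps` for the 27 numbers is a convenient shortcut assumed for the example (it is not the definition by even and odd orders given above, though it yields the same values), and the names `eps` and `vector_product` are hypothetical.

```python
def eps(i, k, m):
    """The alternating symbol for suffixes taking the values 1, 2, 3.

    The product (i-k)(k-m)(m-i) is 2 for the even orders, -2 for the odd
    orders, and 0 when any two suffixes are equal.
    """
    return (i - k) * (k - m) * (m - i) // 2

def vector_product(A, B):
    """Components eps_ikm A_k B_m, summed over the repeated suffixes k and m."""
    return tuple(
        sum(eps(i, k, m) * A[k - 1] * B[m - 1]
            for k in (1, 2, 3) for m in (1, 2, 3))
        for i in (1, 2, 3)
    )

A = (1, 2, 3)
B = (4, 5, 6)

print(eps(1, 2, 3), eps(2, 1, 3), eps(1, 1, 2))
print(vector_product(A, B), vector_product(B, A), vector_product(A, A))
```

Of the nine terms in each sum only two survive, and interchanging the two vectors reverses the sign of every component.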

Two other important properties of $\epsilon_{ikm}$ are as follows. Clearly whatever $A_i$ may be,

$$\epsilon_{ikm} A_k A_m = 0, \qquad (5)$$

since all the terms cancel; this formulates analytically the statement that the vector
product of a vector with itself or any parallel vector is null. If $A_i$, $B_i$, $C_i$ are any sets of three
quantities,

$$\epsilon_{ikm} A_i B_k C_m = \begin{vmatrix} A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \\ C_1 & C_2 & C_3 \end{vmatrix}, \qquad (6)$$

the determinant formed by the nine components. If it vanishes there are values $\alpha$, $\beta$, $\gamma$
such that

$$\alpha A_i + \beta B_i + \gamma C_i = 0 \qquad (7)$$

for all i, so that the vanishing of (6) is the condition for vectors A, B, C to be coplanar.
If the determinant does not vanish, the equations

$$\alpha A_i + \beta B_i + \gamma C_i = D_i \qquad (8)$$

have a unique solution for any D, and we recover the result that any vector can be expressed
linearly in terms of any three non-coplanar vectors.
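
The triple sum that gives the determinant provides a direct numerical test for coplanarity. The Python sketch below is an added illustration with assumed names (`eps`, `triple_sum`) and sample vectors; the third vector of the second set is constructed as twice the first less three times the second, so that the three are coplanar by construction.

```python
def eps(i, k, m):
    """The alternating symbol for suffixes taking the values 1, 2, 3."""
    return (i - k) * (k - m) * (m - i) // 2

def triple_sum(A, B, C):
    """eps_ikm A_i B_k C_m, summed over all three repeated suffixes."""
    return sum(eps(i, k, m) * A[i - 1] * B[k - 1] * C[m - 1]
               for i in (1, 2, 3) for k in (1, 2, 3) for m in (1, 2, 3))

# Three non-coplanar vectors: the determinant does not vanish.
independent = triple_sum((1, 0, 0), (1, 1, 0), (1, 1, 1))

# Here C = 2A - 3B, so the three vectors are coplanar.
A = (1, 2, 3)
B = (4, 5, 6)
C = tuple(2 * a - 3 * b for a, b in zip(A, B))
dependent = triple_sum(A, B, C)

print(independent, dependent)
```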

We now proceed to prove analytically that $\epsilon_{ikm} A_k B_m$ are the components of a vector;
this proof is quite independent of the argument in 2·07.

2·072. Transformation property of vector product. Take any pair of lines with
direction cosines $l_i$, $m_i$. The conditions that a line with direction cosines $n_i$ shall be
perpendicular to both are, written in full,

$$l_1 n_1 + l_2 n_2 + l_3 n_3 = 0, \quad m_1 n_1 + m_2 n_2 + m_3 n_3 = 0, \qquad (1)$$

whence

$$\frac{n_1}{l_2 m_3 - l_3 m_2} = \frac{n_2}{l_3 m_1 - l_1 m_3} = \frac{n_3}{l_1 m_2 - l_2 m_1} = \frac{(n_1^2 + n_2^2 + n_3^2)^{1/2}}{\{(l_2 m_3 - l_3 m_2)^2 + (l_3 m_1 - l_1 m_3)^2 + (l_1 m_2 - l_2 m_1)^2\}^{1/2}}. \qquad (2)$$

But the sum of squares in the denominator is, by Lagrange’s identity,

$$(l_1^2 + l_2^2 + l_3^2)(m_1^2 + m_2^2 + m_3^2) - (l_1 m_1 + l_2 m_2 + l_3 m_3)^2 = 1 - \cos^2\theta = \sin^2\theta, \qquad (3)$$

where $\theta$ is the angle between the lines $l_i$, $m_i$. Also since $n_i$ are direction cosines the
numerator in the last expression in (2) is $\pm 1$. Hence each of these expressions is equal to $\pm\mathrm{cosec}\,\theta$,
and

$$n_i = \pm\mathrm{cosec}\,\theta\; \epsilon_{ikm} l_k m_m. \qquad (4)$$

The ambiguity of sign corresponds to a choice of direction of travel along the line $n_i$.
If we take it so that the direction of rotation from $l_i$ to $m_i$ through the angle $\theta$ is
right-handed about $n_i$ we see from consideration of

$$l_i = (1, 0, 0), \quad m_i = (\cos\theta, \sin\theta, 0), \quad n_i = (0, 0, 1)$$

that the positive sign must be taken. Hence

$$n_i = \mathrm{cosec}\,\theta\; \epsilon_{ikm} l_k m_m. \qquad (5)$$

Now if $l_i$ and $m_i$ are given directions, the perpendicular to them is in a fixed direction
independent of the axes; hence $n_i$, being the direction cosines of a fixed direction,
transform according to the vector rule.

For two general vectors A and B we can now define two directions $l_i$, $m_i$ by $A_i = A l_i$,
$B_i = B m_i$, and then $AB \sin\theta$ is a scalar since A, B, and $\theta$ are all independent of the axes.
Hence $AB \sin\theta\, n_i$ is a vector. But

$$AB \sin\theta\, n_i = AB\, \epsilon_{ikm} l_k m_m = \epsilon_{ikm} A_k B_m, \qquad (6)$$

which proves that for two general vectors the components of the vector product transform
according to the vector rule and that their values are equal to those given by the
displacement definition.

In suffix notation 2·07 (1) and (2) are obvious, and 2·07 (7) does not arise because we
need never consider direction vectors along the axes.
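
The transformation property can be checked numerically for a particular change of axes. The sketch below, in Python, is an added illustration and not a proof: it assumes the convention that the new components are formed as $A'_j = l_{ij} A_i$, takes for the transformation a right-handed turn of the axes about the third axis, and uses hypothetical names (`direction_cosines`, `transform`, `cross`) and arbitrary sample vectors. It compares the vector product of the transformed vectors with the transformed vector product.

```python
import math

def cross(a, b):
    """Components of the vector product of a and b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def direction_cosines(phi):
    """l[i][j]: cosine of the angle between old axis i+1 and new axis j+1,
    for axes turned right-handedly through phi about the third axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(l, A):
    """New components A'_j = l_ij A_i (summed over i)."""
    return tuple(sum(l[i][j] * A[i] for i in range(3)) for j in range(3))

l = direction_cosines(0.7)
A = (1.0, 2.0, 3.0)
B = (-2.0, 0.5, 4.0)

product_of_transformed = cross(transform(l, A), transform(l, B))
transformed_product = transform(l, cross(A, B))

# Largest difference between corresponding components; rounding error only.
largest_difference = max(abs(p - q) for p, q in
                         zip(product_of_transformed, transformed_product))
print(largest_difference)
```

The agreement to within rounding error is what the vector rule requires; a reflexion of the axes, which is not considered here, would not be covered by this check.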


2·073. Relations between the $l_{ij}$. We can now proceed to show that the relations
2·022 (9) do follow from 2·021 (7). This is really a consistency theorem. If it was not true
there would be more than six independent relations between the nine direction cosines
involved in a transformation of axes, and not more than two elements of the transformation
could be assigned independently. But apparently we can rotate the axes by an arbitrary
amount about any line, and this line itself needs two parameters to specify its direction,
making three in all; and by the properties of rigid bodies the frame of the axes will remain
rectangular. Hence we really have already all the information required to justify the
statement that 2·022 (9) must be a consequence of 2·021 (7). But the metrical relations
assumed in the argument might conceivably be mutually inconsistent, and a direct proof
is desirable as a check. We take the above $l_i$ to be the $l_{i1}$ of the transformation, and $m_i$ to
be $l_{i2}$. Then since O3′ is perpendicular to O1′ and O2′, and the rotation from O1′ to O2′ is
taken right-handedly about O3′ through a right angle, $\sin\theta = 1$, and

$$l_{i3} = \epsilon_{ikm} l_{k1} l_{m2}. \qquad (1)$$

Similarly,

$$l_{i1} = \epsilon_{ikm} l_{k2} l_{m3}, \quad l_{i2} = \epsilon_{ikm} l_{k3} l_{m1}. \qquad (2)$$

Now this is the same as saying that in the determinant

$$L = \begin{vmatrix} l_{11} & l_{12} & l_{13} \\ l_{21} & l_{22} & l_{23} \\ l_{31} & l_{32} & l_{33} \end{vmatrix} \qquad (3)$$

every element is equal to its cofactor. For each of the relations (1), (2) asserts this for all
the elements of a column of the determinant. If we expand in terms of the elements of the
first column (j = 1) we therefore get $l_{i1}^2$, which is 1. Hence L = 1. But if we expand in
terms of elements of the first row (i = 1) we get $l_{1j}^2$, which must therefore also be 1;
similarly,

$$l_{2j}^2 = l_{3j}^2 = 1. \qquad (4)$$

On the other hand, if we form $l_{ij} l_{kj}$, where i and k are unequal, we get a determinant with
two rows equal and therefore zero. Hence for all i, k

$$l_{ij} l_{kj} = \delta_{ik}. \qquad (5)$$

The relations (1), (2) are in the form needed for the proof of the theorem, but are stated
as three separate equations. Their similarity suggests that they can be written as one;
this is

$$\epsilon_{jln} l_{ij} = \epsilon_{ikm} l_{kl} l_{mn}. \qquad (6)$$

The only suffixes not repeated on either side are i, l, n. (a) If n follows l in the order 1231,
the only value of j that makes $\epsilon_{jln}$ different from 0 is the predecessor of l, and then $\epsilon_{jln} = 1$.
Hence in this case the left side reduces to $l_{ij}$, where $j \neq l, n$, and this is equal to the right
side by (1), (2). (b) If n precedes l in the order 1231, the left side is

$$-l_{ij} \; (j \neq l, n) = -\epsilon_{ikm} l_{kp} l_{ms},$$

where j, p, s are consecutive in the order 12312; and therefore p = n, s = l, and the right
side is

$$-\epsilon_{ikm} l_{kn} l_{ml} = -\epsilon_{imk} l_{mn} l_{kl} = \epsilon_{ikm} l_{kl} l_{mn}.$$

(c) If l = n, both sides are unaltered if l and n are interchanged; but this interchange
reverses both sides, which are therefore zero. Hence (6) is true for all values of l, n.

By multiplying (6) by $l_{pl}$ we get

$$\epsilon_{jln} l_{ij} l_{pl} = \epsilon_{ikm} l_{kl} l_{mn} l_{pl} = \epsilon_{ikm} l_{mn} \delta_{kp} = \epsilon_{ipm} l_{mn},$$

and putting k for p we have

$$\epsilon_{ikm} l_{mn} = \epsilon_{jln} l_{ij} l_{kl}. \qquad (7)$$

If a determinant is written as

$$A = \begin{vmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{vmatrix},$$

the first suffix referring to the row and the second to the column, it is

$$A = \epsilon_{jln} A_{1j} A_{2l} A_{3n} = \epsilon_{jln} A_{2j} A_{3l} A_{1n} = \epsilon_{jln} A_{3j} A_{1l} A_{2n} = -\epsilon_{jln} A_{2j} A_{1l} A_{3n} = \text{etc.}, \qquad (8)$$

and hence

$$\epsilon_{ikm} A = \epsilon_{jln} A_{ij} A_{kl} A_{mn}. \qquad (9)$$

Thus a determinant whose elements are all identified by row and column suffixes can be
written as an expression in one line by means of the $\epsilon$ notation. This can be extended to
determinants of any order.* 
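
The one-line expression for a determinant of the third order is easily tried on numbers. The following Python sketch is an added illustration; the names `eps`, `epsilon_sum` and `det_by_rows`, and the sample array, are assumptions made for the example. The first function forms the triple sum over column suffixes for any choice of three row suffixes, and the result is compared with the alternating symbol of those row suffixes multiplied by the determinant.

```python
def eps(i, k, m):
    """The alternating symbol for suffixes taking the values 1, 2, 3."""
    return (i - k) * (k - m) * (m - i) // 2

SUFFIXES = (1, 2, 3)

def epsilon_sum(A, i, k, m):
    """eps_jln A_ij A_kl A_mn, summed over the column suffixes j, l, n."""
    return sum(eps(j, l, n) * A[i - 1][j - 1] * A[k - 1][l - 1] * A[m - 1][n - 1]
               for j in SUFFIXES for l in SUFFIXES for n in SUFFIXES)

def det_by_rows(A):
    """The determinant, taking the rows in their natural order 1, 2, 3."""
    return epsilon_sum(A, 1, 2, 3)

A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]

determinant = det_by_rows(A)

# For every choice of row suffixes the sum should equal eps_ikm times A.
all_agree = all(epsilon_sum(A, i, k, m) == eps(i, k, m) * determinant
                for i in SUFFIXES for k in SUFFIXES for m in SUFFIXES)
print(determinant, all_agree)
```

Taking two equal row suffixes gives a determinant with two equal rows, and the sum then vanishes together with the alternating symbol.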

2·074. The numbers $\epsilon_{ikm} \epsilon_{psm}$. One of the most important properties of $\epsilon_{ikm}$ is an
identity satisfied by the 81 numbers

$$\epsilon_{ikm} \epsilon_{psm}.$$

Here we have of course to sum with regard to m, but each of i, k, p, s leads to a separate
expression according as it is taken to be 1, 2, or 3. As each of these four suffixes is capable
of only three values, at least two of them must be the same in each component. Evidently
all components with i = k or p = s are zero. If $i \neq k$, there is only one value of m that makes
$\epsilon_{ikm}$ different from zero, and then the only values of p and s that make $\epsilon_{psm}$ different from
zero are i and k, in either order. If the orders are the same, $\epsilon_{ikm}$ and $\epsilon_{psm}$ are either both 1
or both $-1$, and the product is 1. If the orders are different the product is $-1$. Hence

$$\epsilon_{ikm} \epsilon_{psm} = \begin{cases} 0 & (i = k) \\ 0 & (p = s) \\ 1 & (i = p,\ k = s) \\ -1 & (i = s,\ k = p) \\ 0 & (i \neq p \text{ or } s,\ \text{or } k \neq p \text{ or } s). \end{cases}$$

Now consider the set of numbers 

^ip &ka ~ ^ia^kp* 

If i = k or p = 8 the components cancel. If i or s, or k or s, one factor of each term 
is 0. Hence the only non-zero components are those with i, k equal to p, s, in either order, 
and the members of each pair themselves unequal. But if i = p and k = s, the first term 
is 1 and the second 0, and if i = s and k=p the first is 0 and the second 1. Hence for every 
possible assignment of the four non-repeated suffixes 

^ikm^psm = ^ip^ks ^is^kp’ (^) 

We shall meet this identity again and again in different applications. 

Since ^psm = ^mps = ® 8mp » (^) 

for all assignments of the letters, and similarly for e ikm , the values of the expression on 
the left of (1) will not be altered by replacing ikm by kmi or mik. We can therefore provide 
a general rule for the signs: take i and p to be the suffixes that follow the repeated suffix in 
the respective e (if the repeated suffix is the last, take the respective i or p to be the first) 
factors; then 8 ip appears with the positive sign, and the rest of the formula can be filled in by 
symmetry. 
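Since the identity 2.074 (1) involves only 81 components, it can also be confirmed by direct enumeration. The following is a minimal sketch in Python, offered as a check rather than as part of the argument; the function names `eps`, `delta` and `identity_holds` are our own, and the closed form used for the alternating symbol is one convenient choice among several.

```python
from itertools import product

def eps(i, k, m):
    """Alternating symbol: +1 for even order, -1 for odd order, 0 if a suffix repeats."""
    return (i - k) * (k - m) * (m - i) // 2

def delta(i, k):
    """Substitution symbol delta_ik."""
    return 1 if i == k else 0

def identity_holds():
    """Compare both sides of 2.074 (1) for every choice of i, k, p, s in 1..3."""
    for i, k, p, s in product((1, 2, 3), repeat=4):
        lhs = sum(eps(i, k, m) * eps(p, s, m) for m in (1, 2, 3))
        rhs = delta(i, p) * delta(k, s) - delta(i, s) * delta(k, p)
        if lhs != rhs:
            return False
    return True

print(identity_holds())
```

The sum over the repeated suffix $m$ is written out explicitly, which is all that the summation convention abbreviates.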

* Cf. Durell and Robson, Advanced Algebra, 1937, Chapter 16.

2.08. Division of vectors. This cannot be defined without ambiguity and is avoided.
It is easily seen that, given a non-zero vector $\mathbf{A}$ and a scalar $M$, there is a vector $\mathbf{B}$ such that
the scalar product $\mathbf{A}\cdot\mathbf{B} = M$. But the division is not unique, because we could add to $\mathbf{B}$
any vector perpendicular to $\mathbf{A}$ without affecting the scalar product. In general there is
no vector $\mathbf{B}$ such that the vector product $\mathbf{A}\wedge\mathbf{B} = \mathbf{C}$, where $\mathbf{A}$ and $\mathbf{C}$ are given vectors.
For $\mathbf{A}\wedge\mathbf{B}$ is perpendicular to $\mathbf{A}$, and if $\mathbf{C}$ is not perpendicular to $\mathbf{A}$ there is no vector $\mathbf{B}$
that satisfies the conditions. If $\mathbf{C}$, on the other hand, is perpendicular to $\mathbf{A}$ we could add
to $\mathbf{B}$ any vector parallel to $\mathbf{A}$ without affecting the vector product, and the quotient would
be ambiguous.

In the field of quaternions, which is an extension of vector algebra, a quotient does 
in general exist. Few physicists had used the quaternion method until recently, but it is 
now receiving some notice in quantum theory. (Cf. Chapter 4, Ex. 12.) 

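2.09. Triple products. The scalar product of $\mathbf{B}\wedge\mathbf{C}$ with another vector $\mathbf{A}$, $\mathbf{A}\cdot(\mathbf{B}\wedge\mathbf{C})$,
is called the triple scalar product of $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$. We shall show that in such a product the
order of the factors is immaterial so long as the cyclic order is preserved, and the dot and
the cross may be interchanged without altering the value. There are thus six possible
ways of writing it.

We can also form the vector product $\mathbf{A}\wedge(\mathbf{B}\wedge\mathbf{C})$ of a vector $\mathbf{A}$ with the vector product
of $\mathbf{B}$ and $\mathbf{C}$.

We first examine some special cases:

(i) $\mathbf{B}\cdot(\mathbf{B}\wedge\mathbf{C})$. Clearly

$$\mathbf{B}\cdot(\mathbf{B}\wedge\mathbf{C}) = 0, \tag{1}$$

since $\mathbf{B}\wedge\mathbf{C}$ is perpendicular to $\mathbf{B}$.

(ii) $\mathbf{B}\wedge(\mathbf{B}\wedge\mathbf{C})$. This vector is perpendicular both to $\mathbf{B}$ and to $\mathbf{B}\wedge\mathbf{C}$ and hence is in
a direction perpendicular to $\mathbf{B}$ in the plane of $\mathbf{B}$ and $\mathbf{C}$, obtained by rotating $\mathbf{B}\wedge\mathbf{C}$
right-handedly about $\mathbf{B}$ through a right angle. Its magnitude is $B$ times that of $\mathbf{B}\wedge\mathbf{C}$,
i.e. it is $B^2C\sin\theta$, where $\theta$ is the angle between $\mathbf{B}$ and $\mathbf{C}$. From the figure it therefore
follows that, if $\hat{\mathbf{B}}$ and $\hat{\mathbf{C}}$ denote unit vectors along $\mathbf{B}$ and $\mathbf{C}$,

$$\mathbf{B}\wedge(\mathbf{B}\wedge\mathbf{C}) = B^2C\sin\theta\cot\theta\,\hat{\mathbf{B}} - B^2C\sin\theta\operatorname{cosec}\theta\,\hat{\mathbf{C}} \tag{2}$$

$$= (\mathbf{B}\cdot\mathbf{C})\mathbf{B} - B^2\mathbf{C}. \tag{3}$$

Similarly,

$$\mathbf{C}\wedge(\mathbf{B}\wedge\mathbf{C}) = -\mathbf{C}\wedge(\mathbf{C}\wedge\mathbf{B}) = C^2\mathbf{B} - (\mathbf{B}\cdot\mathbf{C})\mathbf{C}. \tag{4}$$

(iii)

$$(\mathbf{B}\wedge\mathbf{C})\cdot(\mathbf{B}\wedge\mathbf{C}) = B^2C^2 - (\mathbf{B}\cdot\mathbf{C})^2. \tag{5}$$

This follows immediately from the definitions.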

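2.091. The triple scalar product $\mathbf{A}\cdot(\mathbf{B}\wedge\mathbf{C})$. From the commutative law for the
scalar product we have

$$\mathbf{A}\cdot(\mathbf{B}\wedge\mathbf{C}) = (\mathbf{B}\wedge\mathbf{C})\cdot\mathbf{A}. \tag{6}$$

Further, we can write any vector $\mathbf{A}$ as

$$\mathbf{A} = \alpha\mathbf{B} + \beta\mathbf{C} + \gamma\,\mathbf{B}\wedge\mathbf{C}, \tag{7}$$

since $\mathbf{B}$, $\mathbf{C}$, $\mathbf{B}\wedge\mathbf{C}$ are not coplanar if none of them is null. Then

$$\mathbf{A}\cdot(\mathbf{B}\wedge\mathbf{C}) = \gamma(\mathbf{B}\wedge\mathbf{C})^2 = \gamma[B^2C^2 - (\mathbf{B}\cdot\mathbf{C})^2]. \tag{8}$$

Also

$$\mathbf{A}\wedge\mathbf{B} = \beta\,\mathbf{C}\wedge\mathbf{B} - \gamma\,\mathbf{B}\wedge(\mathbf{B}\wedge\mathbf{C}) = \beta\,\mathbf{C}\wedge\mathbf{B} + \gamma B^2\mathbf{C} - \gamma(\mathbf{B}\cdot\mathbf{C})\mathbf{B}, \tag{9}$$

so that

$$\mathbf{C}\cdot(\mathbf{A}\wedge\mathbf{B}) = \gamma[B^2C^2 - (\mathbf{B}\cdot\mathbf{C})^2] = \mathbf{A}\cdot(\mathbf{B}\wedge\mathbf{C}). \tag{10}$$

Similarly,

$$\mathbf{B}\cdot(\mathbf{C}\wedge\mathbf{A}) = \mathbf{A}\cdot(\mathbf{B}\wedge\mathbf{C}). \tag{11}$$

Hence altogether we have six equal products

$$\mathbf{A}\cdot\mathbf{B}\wedge\mathbf{C} = \mathbf{B}\cdot\mathbf{C}\wedge\mathbf{A} = \mathbf{C}\cdot\mathbf{A}\wedge\mathbf{B} = \mathbf{A}\wedge\mathbf{B}\cdot\mathbf{C} = \mathbf{B}\wedge\mathbf{C}\cdot\mathbf{A} = \mathbf{C}\wedge\mathbf{A}\cdot\mathbf{B}. \tag{12}$$

Brackets are unnecessary, since the order of forming the vector and scalar products is
unambiguous. If the cyclic order is $\mathbf{A}$, $\mathbf{C}$, $\mathbf{B}$, instead of $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, the sign is changed.

Geometrical meaning. Since the modulus of $\mathbf{B}\wedge\mathbf{C}$ is the measure of the area of the
parallelogram formed by line segments representing $\mathbf{B}$ and $\mathbf{C}$, it follows that if the angle
between $\mathbf{A}$ and $\mathbf{B}\wedge\mathbf{C}$ is acute, then $\mathbf{A}\cdot\mathbf{B}\wedge\mathbf{C}$ is the measure of the volume of the
parallelepiped formed by line segments representing $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$. If the angle is obtuse
then $-\mathbf{A}\cdot\mathbf{B}\wedge\mathbf{C}$ is equal to this volume.

The triple scalar product $\mathbf{A}\cdot\mathbf{B}\wedge\mathbf{C}$ is sometimes written $[\mathbf{A}, \mathbf{B}, \mathbf{C}]$.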

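2.092. The triple vector product $\mathbf{A}\wedge(\mathbf{B}\wedge\mathbf{C})$. As before we write

$$\mathbf{A} = \alpha\mathbf{B} + \beta\mathbf{C} + \gamma\,\mathbf{B}\wedge\mathbf{C}. \tag{13}$$

Then

$$\begin{aligned} \mathbf{A}\wedge(\mathbf{B}\wedge\mathbf{C}) &= \alpha\,\mathbf{B}\wedge(\mathbf{B}\wedge\mathbf{C}) + \beta\,\mathbf{C}\wedge(\mathbf{B}\wedge\mathbf{C}) \\ &= \alpha[(\mathbf{B}\cdot\mathbf{C})\mathbf{B} - B^2\mathbf{C}] + \beta[C^2\mathbf{B} - (\mathbf{B}\cdot\mathbf{C})\mathbf{C}] \\ &= [(\alpha\mathbf{B} + \beta\mathbf{C})\cdot\mathbf{C}]\mathbf{B} - [(\alpha\mathbf{B} + \beta\mathbf{C})\cdot\mathbf{B}]\mathbf{C} \\ &= (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C}. \end{aligned} \tag{14}$$

Similarly, it may be shown that

$$(\mathbf{A}\wedge\mathbf{B})\wedge\mathbf{C} = (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{B}\cdot\mathbf{C})\mathbf{A}. \tag{15}$$

It should be noticed (1) that the associative law does not hold, (2) as an aid to remembering
the signs, the term on the right-hand side in which the middle vector of the left-hand
side occurs in the scalar product has the negative sign.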
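The results 2.091 (12), 2.092 (14) and 2.092 (15) are easily tested on particular vectors. The sketch below is a Python check under our own choice of three integer vectors; the helper names (`dot`, `cross`, `scale`, `sub`) are ours and the arithmetic is exact because only integers are used. It is an illustration, not a proof.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    # Components of the vector product a ^ b.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(c, a):
    return tuple(c * x for x in a)

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

# Arbitrary illustrative vectors (an assumption of this sketch).
A, B, C = (1, 2, 3), (-2, 0, 5), (4, -1, 2)

triple = dot(A, cross(B, C))
cyclic = [dot(B, cross(C, A)), dot(C, cross(A, B)), dot(cross(A, B), C)]

lhs14 = cross(A, cross(B, C))
rhs14 = sub(scale(dot(A, C), B), scale(dot(A, B), C))

lhs15 = cross(cross(A, B), C)
rhs15 = sub(scale(dot(A, C), B), scale(dot(B, C), A))

print(triple, cyclic, lhs14 == rhs14, lhs15 == rhs15, lhs14 == lhs15)
```

The last comparison illustrates remark (1): for these vectors the two ways of bracketing the triple vector product give different results.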


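2.093. In suffix notation the equivalence of the various forms of the triple scalar
product is obvious. For

$$\epsilon_{ikm}A_iB_kC_m = \begin{vmatrix} A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \\ C_1 & C_2 & C_3 \end{vmatrix},$$

and the forms $\mathbf{A}\cdot(\mathbf{B}\wedge\mathbf{C})$, $\mathbf{B}\cdot(\mathbf{C}\wedge\mathbf{A})$, $\mathbf{C}\cdot(\mathbf{A}\wedge\mathbf{B})$ represent the expansions of this
determinant in terms of the elements of different rows. The other three are expressions of the
commutative relation for the scalar product,

$$A_iB_i = B_iA_i.$$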

2.094. The expression for the triple vector product depends on the identity 2.074 (1)
satisfied by the $\epsilon_{ikm}$. We notice first that the use of the summation convention requires
that any repeated suffix must occur only twice, otherwise there will be an ambiguity
about the order of summation. We must therefore write the $i$ component of the triple
vector product $\mathbf{A}\wedge(\mathbf{B}\wedge\mathbf{C})$ as

$$\begin{aligned} \epsilon_{ikm}A_k\epsilon_{mps}B_pC_s &= \epsilon_{ikm}\epsilon_{mps}A_kB_pC_s \\ &= (\delta_{ip}\delta_{ks} - \delta_{is}\delta_{kp})A_kB_pC_s \\ &= \delta_{ks}A_kB_iC_s - \delta_{kp}A_kB_pC_i \\ &= B_iA_kC_k - C_iA_kB_k, \end{aligned}$$

and

$$\mathbf{A}\wedge(\mathbf{B}\wedge\mathbf{C}) = (\mathbf{A}\cdot\mathbf{C})\mathbf{B} - (\mathbf{A}\cdot\mathbf{B})\mathbf{C}.$$

Also

$$(\mathbf{A}\wedge\mathbf{B})\wedge\mathbf{C} = -\mathbf{C}\wedge(\mathbf{A}\wedge\mathbf{B}) = -(\mathbf{C}\cdot\mathbf{B})\mathbf{A} + (\mathbf{C}\cdot\mathbf{A})\mathbf{B}.$$

Mathematical physicists differ widely with respect to their ability to remember these
formulae; but the formulae can always be recovered in a few lines from the identity
2.074 (1), which is much less difficult for the memory and has other applications.

2.10. Vector functions of a scalar variable. Differentiation. We shall denote a
general scalar variable by $t$ and let $\mathbf{A}(t)$ be a general vector function. We define the
differential coefficient of $\mathbf{A}(t)$ with respect to $t$ in the following way. Consider the ratio

$$\frac{\delta\mathbf{A}}{\delta t} = \frac{\mathbf{A}(t + \delta t) - \mathbf{A}(t)}{\delta t}.$$

If $\delta t$ is any non-zero quantity it is clear that $\delta\mathbf{A}/\delta t$ is a vector, and if as $\delta t \to 0$ $\delta\mathbf{A}/\delta t$ tends
to a limit, we define this limit as the vector $d\mathbf{A}/dt$. A formal proof that the limit is itself
a vector depends simply on the theorem that the sum of the limits of two functions is
equal to the limit of their sum.

The components of the vector $d\mathbf{A}/dt$ are $(dA_1/dt,\ dA_2/dt,\ dA_3/dt)$. It is important to
notice that not only is the modulus of $d\mathbf{A}/dt$ in general different from that of $\mathbf{A}$, but also
its direction. In particular the differential coefficient of a vector of constant modulus,
whose direction varies, is not zero.

Differentiation of products. The rule for differentiating a product of two scalar functions
is easily extended to differentiation of the product of a scalar with a vector function and
to scalar and vector products. We shall simply state the results here; the proof is in every
case straightforward and is left to the reader.

(1) If $a$ is any scalar function of $t$, then

$$\frac{d}{dt}(a\mathbf{A}) = \frac{da}{dt}\mathbf{A} + a\frac{d\mathbf{A}}{dt}, \qquad \frac{d}{dt}(aA_i) = \frac{da}{dt}A_i + a\frac{dA_i}{dt}.$$

(2) If $\mathbf{A}$ and $\mathbf{B}$ are two vector functions of $t$ then

$$\frac{d}{dt}(\mathbf{A}\cdot\mathbf{B}) = \frac{d\mathbf{A}}{dt}\cdot\mathbf{B} + \mathbf{A}\cdot\frac{d\mathbf{B}}{dt}, \qquad \frac{d}{dt}(A_iB_i) = \frac{dA_i}{dt}B_i + A_i\frac{dB_i}{dt}.$$

The order in the products is here immaterial.

(3)

$$\frac{d}{dt}(\mathbf{A}\wedge\mathbf{B}) = \frac{d\mathbf{A}}{dt}\wedge\mathbf{B} + \mathbf{A}\wedge\frac{d\mathbf{B}}{dt}, \qquad \frac{d}{dt}(\epsilon_{ikm}A_kB_m) = \epsilon_{ikm}\frac{dA_k}{dt}B_m + \epsilon_{ikm}A_k\frac{dB_m}{dt}.$$

The order in the products must here be maintained in the vector notation. In suffix
notation rearrangement of the factors is permissible.

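Some special results arising from these are important:

(1) If $\mathbf{A}$ is a vector of constant modulus then

$$\frac{d}{dt}(\mathbf{A}\cdot\mathbf{A}) = 0,$$

i.e.

$$\mathbf{A}\cdot\frac{d\mathbf{A}}{dt} = 0,$$

which shows that $d\mathbf{A}/dt$ is perpendicular to $\mathbf{A}$. Translation into suffix notation is
immediate.

This can be seen geometrically. In the figure the lines representing $\mathbf{A}$ and $\mathbf{A} + \delta\mathbf{A}$ are
of the same length. As $Q$ approaches $P$ the angle between $OP$ and $PQ$ approaches a right angle.

(2) If any vector function $\mathbf{A}$ is written as the product of its
modulus $A$ with the direction vector $\mathbf{n}$, then

$$\frac{d\mathbf{A}}{dt} = \frac{dA}{dt}\mathbf{n} + A\frac{d\mathbf{n}}{dt}.$$

That is, if $A_i = Al_i$,

$$\frac{dA_i}{dt} = \frac{dA}{dt}l_i + A\frac{dl_i}{dt}.$$

It should be noticed that $dA/dt$, the rate of change of the modulus, is not in general equal to
$|d\mathbf{A}/dt|$, the modulus of the rate of change.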
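The two special results can be illustrated numerically. The sketch below, in Python, takes a vector of constant modulus whose direction varies (the particular function `A`, the radius `R` and the instant `t0` are arbitrary choices of ours), differentiates it by central differences, and compares $\mathbf{A}\cdot d\mathbf{A}/dt$, $|d\mathbf{A}/dt|$ and $dA/dt$. Finite differences are an approximation, so the first and third quantities are only expected to vanish to within rounding and truncation error.

```python
import math

def A(t, R=2.0):
    """A vector of constant modulus R whose direction varies with t."""
    theta, phi = 0.3 + t * t, 2.0 * t
    return (R * math.sin(theta) * math.cos(phi),
            R * math.sin(theta) * math.sin(phi),
            R * math.cos(theta))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def derivative(f, t, h=1e-6):
    """Central-difference estimate of df/dt, component by component."""
    lo, hi = f(t - h), f(t + h)
    return tuple((q - p) / (2 * h) for p, q in zip(lo, hi))

t0 = 0.7
dA = derivative(A, t0)
perp = dot(A(t0), dA)                                      # A . dA/dt
speed = norm(dA)                                           # |dA/dt|
dmod = (norm(A(t0 + 1e-6)) - norm(A(t0 - 1e-6))) / 2e-6    # d|A|/dt

print(perp, speed, dmod)
```

Here $|d\mathbf{A}/dt|$ is far from zero although the modulus itself does not change.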


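2.11. Motion of a particle under gravity with resistance varying as the velocity.
Let the origin $O$ be at the point of projection. The resistance is assumed to act along the
tangent to the path in the opposite direction to that of motion and is therefore expressed
by a force vector $-m\kappa\mathbf{v}$, where $m$ is the mass of the particle and $\kappa$ is a constant. Let $\mathbf{k}$
be a direction vector in the direction of the upward vertical. Then equating the mass
times acceleration of the particle to the force acting on it we have

$$m\ddot{\mathbf{x}} = -mg\mathbf{k} - m\kappa\dot{\mathbf{x}} \tag{1}$$

or

$$\ddot{\mathbf{x}} + \kappa\dot{\mathbf{x}} = -g\mathbf{k}. \tag{2}$$

We can integrate this vector equation as it stands. It may be written

$$\frac{d}{dt}(\dot{\mathbf{x}}e^{\kappa t}) = -ge^{\kappa t}\mathbf{k}, \tag{3}$$

so that

$$\dot{\mathbf{x}}e^{\kappa t} = -\frac{g}{\kappa}e^{\kappa t}\mathbf{k} + \mathbf{V} + \frac{g}{\kappa}\mathbf{k}, \tag{4}$$

if $\mathbf{V}$ is the velocity of projection from $O$ at time $t = 0$. Hence

$$\dot{\mathbf{x}} = e^{-\kappa t}\mathbf{V} - \frac{g}{\kappa}(1 - e^{-\kappa t})\mathbf{k} \tag{5}$$

and

$$\mathbf{x} = \frac{\mathbf{V}}{\kappa}(1 - e^{-\kappa t}) - \frac{g}{\kappa}t\mathbf{k} + \frac{g}{\kappa^2}(1 - e^{-\kappa t})\mathbf{k}, \tag{6}$$

since $\mathbf{x} = 0$ when $t = 0$.

It can be seen immediately from this equation that at time $t$ all particles projected with
speed $V$ from $O$ lie on a circle whose centre $C$ is at a depth $\dfrac{g}{\kappa}t - \dfrac{g}{\kappa^2}(1 - e^{-\kappa t})$ below $O$, for

$$\left|\mathbf{x} + \frac{g}{\kappa}t\mathbf{k} - \frac{g}{\kappa^2}(1 - e^{-\kappa t})\mathbf{k}\right| = \frac{V}{\kappa}(1 - e^{-\kappa t}), \tag{7}$$

i.e.

$$|\overrightarrow{CO} + \mathbf{x}| = CP = \frac{V}{\kappa}(1 - e^{-\kappa t}). \tag{8}$$

Hence $CP$ is equal to $\dfrac{V}{\kappa}(1 - e^{-\kappa t})$ and is independent of the direction
of projection.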

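Differentiating (2) with respect to the time we have

$$\dot{\mathbf{a}} + \kappa\mathbf{a} = 0, \tag{9}$$

i.e. if the acceleration is $\mathbf{a}_0$ at time $t = 0$,

$$\mathbf{a} = \mathbf{a}_0e^{-\kappa t}. \tag{10}$$

Hence throughout the motion the direction of the acceleration is the same. Also if $u$ is
the horizontal component of the velocity and $u_0$ its initial value we have from (5)

$$u = u_0e^{-\kappa t}, \tag{11}$$

and if $d$ is the distance travelled horizontally in time $t$, since $u = \dot{d}$,

$$d = \frac{u_0}{\kappa}(1 - e^{-\kappa t}). \tag{12}$$

Hence

$$e^{-\kappa t} = 1 - \frac{\kappa d}{u_0}, \tag{13}$$

and (10) becomes

$$\mathbf{a} = \mathbf{a}_0\left(1 - \frac{\kappa d}{u_0}\right). \tag{14}$$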
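The solution 2.11 (6) and the statement about the circle of centre $C$ can be checked numerically. The following Python sketch substitutes (6) into the equation of motion (2) by central differences and compares the distance $CP$ for several directions of projection with the same speed. The numerical values of $g$ and $\kappa$, the step `h` and the function names are assumptions made for illustration; the vertical direction $\mathbf{k}$ is taken as the third axis.

```python
import math

G, KAPPA = 9.81, 0.5   # illustrative values, not taken from the text

def position(V, t, g=G, kappa=KAPPA):
    """Equation (6), with the upward vertical k along the third axis."""
    f = (1.0 - math.exp(-kappa * t)) / kappa
    x = [f * v for v in V]
    x[2] += -(g / kappa) * t + (g / kappa) * f
    return tuple(x)

def centre(t, g=G, kappa=KAPPA):
    """Position of C: the point that (6) gives for V = 0."""
    f = (1.0 - math.exp(-kappa * t)) / kappa
    return (0.0, 0.0, -(g / kappa) * t + (g / kappa) * f)

def radius_from_centre(V, t):
    x, c = position(V, t), centre(t)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c)))

def residual(V, t, h=1e-4):
    """Largest component of x'' + kappa x' + g k, estimated by central differences."""
    xm, x0, xp = position(V, t - h), position(V, t), position(V, t + h)
    worst = 0.0
    for i in range(3):
        vel = (xp[i] - xm[i]) / (2 * h)
        acc = (xp[i] - 2 * x0[i] + xm[i]) / (h * h)
        worst = max(worst, abs(acc + KAPPA * vel + (G if i == 2 else 0.0)))
    return worst

for V in [(10.0, 0.0, 0.0), (0.0, 6.0, 8.0), (6.0, 0.0, -8.0)]:
    print(radius_from_centre(V, 1.3), residual(V, 1.3))
```

All three velocities have speed 10, so the printed radii should agree with $(V/\kappa)(1 - e^{-\kappa t})$, while the residuals should be small compared with $g$.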


2.12. Motion of a charged particle in electric and magnetic fields at right
angles. If $m$, $-e$ are the mass and electric charge of the particle, $c$ the velocity of light,
$\mathbf{E}$, $\mathbf{H}$ the electric and magnetic fields in Gaussian units, the equation of motion is

$$m\ddot{\mathbf{x}} = -e\mathbf{E} - \frac{e}{c}\dot{\mathbf{x}}\wedge\mathbf{H}, \tag{1}$$

that is,

$$\ddot{x}_i = -\frac{e}{m}E_i - \frac{e}{mc}\epsilon_{ikm}\dot{x}_kH_m. \tag{2}$$

Take

$$\mathbf{E} = (E, 0, 0), \qquad \mathbf{H} = (0, 0, H). \tag{3}$$

Then

$$\ddot{x}_1 = -\frac{e}{m}E - \frac{e}{mc}\dot{x}_2H, \tag{4}$$

$$\ddot{x}_2 = \frac{e}{mc}\dot{x}_1H, \tag{5}$$

$$\ddot{x}_3 = 0. \tag{6}$$

Let the particle start from the origin with zero velocity. Then $x_3 = 0$ for all time; the
motion is in a plane perpendicular to the magnetic field. Multiply (5) by $i$ and add to (4),
and put $x_1 + ix_2 = z$.* Then

$$\ddot{z} = -\frac{eE}{m} + \frac{ieH}{mc}\dot{z}.$$

Put $eH/mc = \omega$; then

$$\ddot{z} - i\omega\dot{z} = -\frac{eE}{m},$$

$$\dot{z} = -\frac{ieE}{m\omega}(1 - e^{i\omega t}),$$

since $\dot{z} = 0$ at $t = 0$; and

$$z = -\frac{ieE}{m\omega}\left(t - \frac{e^{i\omega t} - 1}{i\omega}\right) = -\frac{eE}{m\omega^2}(i\omega t - e^{i\omega t} + 1),$$

$$x_1 = -\frac{eE}{m\omega^2}(1 - \cos\omega t),$$

$$x_2 = -\frac{eE}{m\omega^2}(\omega t - \sin\omega t).$$

The path is a cycloid with its cusps along the negative direction of the axis of $x_2$.
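The complex solution of 2.12 can be tested in the same spirit. The Python sketch below evaluates $z(t)$, checks by central differences that it satisfies $\ddot{z} - i\omega\dot{z} = -eE/m$, and compares its real and imaginary parts with the expressions for $x_1$ and $x_2$. The values chosen for $eE/m$ and $\omega$, and the function names, are assumptions for illustration only.

```python
import cmath
import math

def z(t, a=1.0, omega=2.0):
    """z = x1 + i x2 for a start from rest at the origin; a stands for eE/m."""
    return -(a / omega ** 2) * (1j * omega * t - cmath.exp(1j * omega * t) + 1)

def x1(t, a=1.0, omega=2.0):
    return -(a / omega ** 2) * (1 - math.cos(omega * t))

def x2(t, a=1.0, omega=2.0):
    return -(a / omega ** 2) * (omega * t - math.sin(omega * t))

def ode_residual(t, a=1.0, omega=2.0, h=1e-4):
    """|z'' - i omega z' + a| estimated by central differences."""
    zm, z0, zp = z(t - h, a, omega), z(t, a, omega), z(t + h, a, omega)
    vel = (zp - zm) / (2 * h)
    acc = (zp - 2 * z0 + zm) / (h * h)
    return abs(acc - 1j * omega * vel + a)

for t in (0.0, 0.9, 2.5):
    print(z(t), x1(t), x2(t), ode_residual(t))
```

At $\omega t = 2\pi n$ the coordinate $x_1$ returns to zero while $x_2$ has decreased by a further $2\pi eE/m\omega^2$, which locates the cusps on the negative axis of $x_2$.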

* If the student is not already acquainted with the elements of the theory of the complex variable
he should read the beginning of Chapter 11 at this point.

2.13. Small angular displacement; angular velocity. Let a particle originally
at $P(\mathbf{x})$ receive a displacement due to a small rotation $\delta\theta$ (right-handed) about a line $ON$
through $O$ with direction cosines $n_i$. Let $\alpha$ be the angle between the axis of rotation and $OP$.
Then to the first order in $\delta\theta$ the displacement of $P$ is perpendicular to the
plane of $\mathbf{x}$ and $\mathbf{n}$ and has modulus $r\sin\alpha\,\delta\theta$. Hence the
displacement $\delta\mathbf{x}$ is given by

$$\delta\mathbf{x} = \delta\theta\,\mathbf{n}\wedge\mathbf{x} + O(\delta\theta)^2, \tag{1}$$

since $|\mathbf{n}\wedge\mathbf{x}| = r\sin\alpha$; or if we put

$$\delta\boldsymbol{\theta} = \mathbf{n}\,\delta\theta, \tag{2}$$

$$\delta\mathbf{x} = \delta\boldsymbol{\theta}\wedge\mathbf{x} + O(\delta\theta)^2. \tag{3}$$

$\delta\boldsymbol{\theta}$ is a vector because $\delta\theta$ is a scalar and $n_i$ are the direction cosines of a given line.
It follows that if $\mathbf{v}$ is the velocity of $P$ and $\delta\theta/\delta t$ has a limit $\omega$ when $\delta t \to 0$,

$$\mathbf{v} = \lim\frac{\delta\mathbf{x}}{\delta t} = \boldsymbol{\omega}\wedge\mathbf{x}, \tag{4}$$

where

$$\boldsymbol{\omega} = \omega\mathbf{n}. \tag{5}$$

Conversely, if the velocity $\dot{\mathbf{x}}$ is given by an expression of the form (4) for all $t$, with $\boldsymbol{\omega}$
constant in magnitude and direction, we can recognize the motion as circular motion
with constant velocity. For $\dot{\mathbf{x}}\cdot\boldsymbol{\omega} = 0$ and therefore the motion is in a plane perpendicular
to $\boldsymbol{\omega}$. Also $\mathbf{x}\cdot\dot{\mathbf{x}} = 0$; and therefore $r^2$ is constant and the motion is on a sphere. It is
therefore in a circle about $ON$ of radius $r\sin\alpha$. Finally,

$$\dot{\mathbf{x}}\cdot\dot{\mathbf{x}} = (\boldsymbol{\omega}\wedge\mathbf{x})\cdot(\boldsymbol{\omega}\wedge\mathbf{x}) = (\omega r\sin\alpha)^2, \tag{6}$$

and therefore the velocity is constant and the angular velocity $\omega$. The sign can be checked
separately.

The equation (4) can be regarded as a family of three differential equations for the $x_i$,
namely,

$$\dot{x}_i = \epsilon_{ikm}\omega_kx_m. \tag{7}$$

The student should carry through the derivation of (6) for himself for practice in using
$\epsilon_{ikm}$. (7) can be used to illustrate a method that is often useful when one axis is specialized.
We take the axis of $\boldsymbol{\omega}$ as that of $x_3$; then $\boldsymbol{\omega} = (0, 0, \omega)$ and

$$\dot{x}_1 = -\omega x_2, \qquad \dot{x}_2 = \omega x_1, \qquad \dot{x}_3 = 0. \tag{8}$$

Then $x_3$ is constant. Multiply the second equation by $i$ and add to the first,* and put
$\zeta = x_1 + ix_2$. Then

$$\dot{\zeta} = i\omega\zeta, \qquad \zeta = Ce^{i\omega t}e^{i\beta} \tag{9}$$

with $C$ and $\beta$ real, and the real and imaginary parts give

$$x_1 = C\cos(\omega t + \beta), \qquad x_2 = C\sin(\omega t + \beta). \tag{10}$$

These equations represent uniform motion in a circle of radius $C$; and the solution
contains three adjustable constants as it should, namely, $x_3$, $C$, and $\beta$.

We can solve equation 2.12 (1) in another way. Integrating once we have

$$m\dot{\mathbf{x}} = -e\mathbf{E}t - \frac{e}{c}\mathbf{x}\wedge\mathbf{H}. \tag{11}$$

We put

$$\mathbf{E} = E\mathbf{e}_{(1)} = E\mathbf{e}_{(2)}\wedge\mathbf{e}_{(3)}, \qquad \mathbf{H} = H\mathbf{e}_{(3)}. \tag{12}$$

Then (11) may be rearranged as

$$\dot{\mathbf{x}} = \frac{eH}{mc}\,\mathbf{e}_{(3)}\wedge\left(\mathbf{x} + \frac{cEt}{H}\mathbf{e}_{(2)}\right),$$

which we interpret immediately as follows: the particle is moving with angular velocity
$eH/mc$ about an axis parallel to $\mathbf{e}_{(3)}$ which is itself moving with constant velocity $-\dfrac{cE}{H}\mathbf{e}_{(2)}$.


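* Cf. the footnote to 2.12.

2.131. In particular if $\mathbf{i}$ and $\mathbf{j}$ are two mutually perpendicular direction vectors in the
(1, 2) plane and, at time $t$, $\mathbf{i}$ makes an angle $\theta$ and $\mathbf{j}$ an angle $\tfrac{1}{2}\pi + \theta$ with $O1$, we have that

$$\frac{d\mathbf{i}}{dt} = \dot{\theta}\mathbf{j}, \tag{1}$$

$$\frac{d\mathbf{j}}{dt} = -\dot{\theta}\mathbf{i}. \tag{2}$$

If at time $t$ the position vector of a particle moving in a plane is $\mathbf{x}$, then the components
of its velocity $d\mathbf{x}/dt$ and acceleration $d^2\mathbf{x}/dt^2$ resolved along $O1$, $O2$ are $(\dot{x}_1, \dot{x}_2)$ and $(\ddot{x}_1, \ddot{x}_2)$
respectively. If $\mathbf{i}$ and $\mathbf{j}$ are direction vectors along $\mathbf{x}$ and perpendicular to it, we have

$$\mathbf{x} = r\mathbf{i},$$

$$\frac{d\mathbf{x}}{dt} = \dot{r}\mathbf{i} + r\frac{d\mathbf{i}}{dt} = \dot{r}\mathbf{i} + r\dot{\theta}\mathbf{j}, \tag{3}$$

and

$$\frac{d^2\mathbf{x}}{dt^2} = \ddot{r}\mathbf{i} + \dot{r}\frac{d\mathbf{i}}{dt} + \frac{d}{dt}(r\dot{\theta})\mathbf{j} + r\dot{\theta}\frac{d\mathbf{j}}{dt} = (\ddot{r} - r\dot{\theta}^2)\mathbf{i} + \frac{1}{r}\frac{d}{dt}(r^2\dot{\theta})\mathbf{j}. \tag{4}$$

(3) and (4) give the components of velocity and acceleration resolved along and
perpendicular to the radius vector.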
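Formula (4) can be compared with a direct numerical differentiation of the Cartesian coordinates. In the Python sketch below the path $r = 1 + \tfrac{1}{2}t$, $\theta = t^2$ is an arbitrary choice of ours, as are the function names and the step `h`; the transverse component is written as $2\dot{r}\dot{\theta} + r\ddot{\theta}$, which is $(1/r)\,d(r^2\dot{\theta})/dt$ expanded.

```python
import math

def r(t):
    return 1.0 + 0.5 * t

def theta(t):
    return t * t

def x(t):
    """Cartesian coordinates (x1, x2) of the particle."""
    return (r(t) * math.cos(theta(t)), r(t) * math.sin(theta(t)))

def acceleration_components(t, h=1e-4):
    """Numerical d2x/dt2 resolved along i (radial) and j (transverse)."""
    xm, x0, xp = x(t - h), x(t), x(t + h)
    acc = [(p - 2 * c + m) / (h * h) for m, c, p in zip(xm, x0, xp)]
    i = (math.cos(theta(t)), math.sin(theta(t)))
    j = (-math.sin(theta(t)), math.cos(theta(t)))
    return (acc[0] * i[0] + acc[1] * i[1], acc[0] * j[0] + acc[1] * j[1])

def formula_4(t):
    """Radial and transverse components from equation (4) for this path."""
    r_dot, r_ddot = 0.5, 0.0
    th_dot, th_ddot = 2.0 * t, 2.0
    radial = r_ddot - r(t) * th_dot ** 2
    transverse = 2.0 * r_dot * th_dot + r(t) * th_ddot
    return (radial, transverse)

print(acceleration_components(0.8), formula_4(0.8))
```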




Similarly, if $\mathbf{t}$ and $\mathbf{n}$ are direction vectors along the tangent and inward normal to the
path of the particle, and if $\psi$ is the angle between $\mathbf{t}$ and a fixed direction, chosen so that
$\psi$ is increasing going along the path in the direction of motion, then

$$\frac{d\mathbf{t}}{dt} = \frac{d\psi}{dt}\mathbf{n} = \frac{d\psi}{ds}\dot{s}\,\mathbf{n} = \frac{\dot{s}}{\rho}\mathbf{n},$$

where $s$ is distance measured along the path from a fixed point and $\rho$ is the radius of
curvature (here taken essentially positive). Now the velocity vector $\mathbf{v}$ may be written

$$\mathbf{v} = v\mathbf{t} = \dot{s}\mathbf{t};$$

hence the acceleration $d\mathbf{v}/dt$ is given by

$$\frac{d\mathbf{v}}{dt} = \dot{v}\mathbf{t} + \frac{v^2}{\rho}\mathbf{n}.$$

2.14. Angular velocity of a rigid body. Any displacement of a rigid body is
equivalent to a translation, that is, a motion such that every particle receives the same
displacement, followed by a rotation about an axis. By definition a rigid body is such that
in any possible motion the distance between any pair of particles is unaltered. Let a
particle at O go to O' in the actual displacement. First consider every particle to receive
a displacement OO'; this alters no distance between particles. Let this displacement take
two particles P, Q to P', Q'. Let the positions taken by P, Q in the actual displacement
be P", Q". The metrical definition of a plane is that it is a locus of points equidistant
from two given points. Then the points equidistant from P' and P" lie on a plane, and O'
is on this plane because O'P' and O'P" are both equal to OP. Similarly, points equidistant
from Q' and Q" lie on a plane through O'. These two planes intersect in a line, and every
point R' of this line must satisfy R'P' = R'P", R'Q' = R'Q". Therefore, since it maintains its
distance also from the particle at O', it is occupied by the same particle in the P'Q' and
P"Q" positions.

Now consider any pair of particles S, T in their three positions. Angles between planes
of particles are conserved; hence the angle between the planes O'R'S' and O'R'S" is equal
to that between O'R'T' and O'R'T". Therefore the P'Q' position can be brought into the P"Q"
position by turning every particle about O'R' through the same angle.

Let the position of $O$ be $\mathbf{a}$, of $O'$ $\mathbf{a} + \delta\mathbf{a}$. Let $O'R'$ have direction cosines $n_i$, and let the
rotation about it be through a small angle $\delta\theta$. Then the displacement of $P(\mathbf{x})$ to $P'$ is $\delta\mathbf{a}$,

$$\overrightarrow{O'P'} = \overrightarrow{OP} = \mathbf{x} - \mathbf{a}, \tag{1}$$

and

$$\overrightarrow{P'P''} = \mathbf{n}\,\delta\theta\wedge\overrightarrow{O'P'} + O(\delta\theta)^2. \tag{2}$$

Put as before $\mathbf{n}\,\delta\theta = \delta\boldsymbol{\theta}$; then

$$\overrightarrow{PP''} = \overrightarrow{PP'} + \overrightarrow{P'P''} = \delta\mathbf{a} + \delta\boldsymbol{\theta}\wedge(\mathbf{x} - \mathbf{a}) + O(\delta\theta)^2, \tag{3}$$

which gives the displacement of a general particle of the body.

The velocity of $P$ is then

$$\mathbf{v} = \lim_{\delta t \to 0}\frac{\overrightarrow{PP''}}{\delta t} = \frac{d\mathbf{a}}{dt} + \lim\frac{\delta\boldsymbol{\theta}}{\delta t}\wedge(\mathbf{x} - \mathbf{a}) = \dot{\mathbf{a}} + \boldsymbol{\omega}\wedge(\mathbf{x} - \mathbf{a}), \tag{4}$$

where

$$\boldsymbol{\omega} = \lim\frac{\delta\boldsymbol{\theta}}{\delta t} \tag{5}$$

and is called the angular velocity of the body. Alternatively, we can write (4) as

$$\dot{x}_i = \dot{a}_i + \epsilon_{ikm}\omega_k(x_m - a_m). \tag{6}$$

To check the fact that the set of velocities (6) represents a motion of a rigid body, consider
the variation of the distance between two particles with coordinates $x_i$, $y_i$. We have

$$\frac{d}{dt}(y_i - x_i)^2 = 2(y_i - x_i)(\dot{y}_i - \dot{x}_i) = 2(y_i - x_i)\epsilon_{ikm}\omega_k(y_m - x_m) = 0, \tag{7}$$

which proves the result. Applying the same argument to small rotations we find that
distances between particles are unchanged to the first order by a set of displacements
given by (3); the second-order terms are more complicated. If we neglect them we can
say that small angular displacements can be compounded by the parallelogram rule, in
the sense that the sum of the displacements of a particle due to small rotations about
different lines is the displacement due to the resultant of the rotations according to the
rule. In particular

$$(\delta\boldsymbol{\theta}\wedge\mathbf{x})_i = \epsilon_{ikm}\,\delta\theta_k\,x_m, \tag{8}$$

and the components are $(x_3\delta\theta_2 - x_2\delta\theta_3,\ x_1\delta\theta_3 - x_3\delta\theta_1,\ x_2\delta\theta_1 - x_1\delta\theta_2)$, which are the sums
of those due to separate rotations $(\delta\theta_1, \delta\theta_2, \delta\theta_3)$ about the axes.
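The check (7) and the composition rule (8) involve nothing beyond the algebra of the vector product, so they can be verified exactly with integers. The Python sketch below does this for one arbitrary choice of $\mathbf{a}$, $\dot{\mathbf{a}}$, $\boldsymbol{\omega}$ and two particles; all names and numerical values are assumptions made for illustration.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(p + q for p, q in zip(a, b))

def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))

def velocity(x, a, a_dot, omega):
    """Equation (4): v = a_dot + omega ^ (x - a)."""
    return add(a_dot, cross(omega, sub(x, a)))

def separation_rate(x, y, a, a_dot, omega):
    """d/dt (y_i - x_i)^2 = 2 (y_i - x_i)(ydot_i - xdot_i), as in equation (7)."""
    rel_velocity = sub(velocity(y, a, a_dot, omega), velocity(x, a, a_dot, omega))
    return 2 * dot(sub(y, x), rel_velocity)

def small_rotation(dtheta, x):
    """First-order displacement dtheta ^ x of equation (8)."""
    return cross(dtheta, x)

a, a_dot, omega = (1, 0, 2), (3, -1, 4), (2, 5, -3)
x, y = (4, 1, -2), (-1, 3, 6)
print(separation_rate(x, y, a, a_dot, omega))

d1, d2, d3 = 2, -1, 3   # stand-ins for the small angles; only linearity is tested
parts = [small_rotation(d, x) for d in ((d1, 0, 0), (0, d2, 0), (0, 0, d3))]
total = add(add(parts[0], parts[1]), parts[2])
print(total == small_rotation((d1, d2, d3), x))
```

Only the first-order displacements are compounded here; as the text remarks, the second-order terms do not combine so simply.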

2.15. Forces on a rigid body. For a particle we have the usual equations of motion,
which can be written

$$m\ddot{\mathbf{x}} = \mathbf{X}, \qquad m\ddot{x}_i = X_i. \tag{1}$$

Then for every particle of a rigid body these equations are true. If we simply add them
and use $S$ to denote summation over all particles we have

$$S(m\ddot{\mathbf{x}}) = S\mathbf{X}. \tag{2}$$

If we now write $Sm = M$, $M$ will be the whole mass of the body; and if further we write

$$S(m\mathbf{x}) = M\bar{\mathbf{x}}, \tag{3}$$

$\bar{x}_i$ will be the coordinates of a point, which we shall call the centre of mass; and

$$M\ddot{\bar{\mathbf{x}}} = S\mathbf{X}, \tag{4}$$

the resultant of the forces on the particles.

Next, form the vector product of (1) with $\mathbf{x}$; then

$$m\mathbf{x}\wedge\ddot{\mathbf{x}} = \mathbf{x}\wedge\mathbf{X}, \qquad m\epsilon_{ikm}x_k\ddot{x}_m = \epsilon_{ikm}x_kX_m. \tag{5}$$

By addition

$$S(m\mathbf{x}\wedge\ddot{\mathbf{x}}) = S(\mathbf{x}\wedge\mathbf{X}), \qquad S(m\epsilon_{ikm}x_k\ddot{x}_m) = S(\epsilon_{ikm}x_kX_m). \tag{6}$$

The centre of mass of a rigid body is fixed in the body. This is usually taken for granted,
but is not obvious. Let us consider $r_1$, the distance of the centre of mass from any given
particle, originally at $\mathbf{x}_1$, and let any other particle of mass $m_l$ be at $\mathbf{x}_l$. Then

$$(Sm_l)(\mathbf{x}_1 - \bar{\mathbf{x}}) = (Sm_l)\mathbf{x}_1 - Sm_l\mathbf{x}_l = Sm_l(\mathbf{x}_1 - \mathbf{x}_l), \tag{7}$$

$$(Sm_l)^2r_1^2 = \{Sm_l(\mathbf{x}_1 - \mathbf{x}_l)\}\cdot\{S'm_{l'}(\mathbf{x}_1 - \mathbf{x}_{l'})\} = SS'm_lm_{l'}(\mathbf{x}_1 - \mathbf{x}_l)\cdot(\mathbf{x}_1 - \mathbf{x}_{l'}), \tag{8}$$

$S$ denoting summation with respect to $l$, $S'$ with respect to $l'$. But

$$(\mathbf{x}_1 - \mathbf{x}_l)\cdot(\mathbf{x}_1 - \mathbf{x}_{l'}) = \tfrac{1}{2}(r_{1l}^2 + r_{1l'}^2 - r_{ll'}^2), \tag{9}$$

with an obvious notation. Hence $r_1^2$ is expressed entirely in terms of masses of particles
and distances between them, and therefore is unaltered in any rigid-body displacement.
Hence the centre of mass retains its distance from every particle of the body.

It is not enough to consider a particle originally at $\bar{\mathbf{x}}$, since for a hollow sphere there is
no such particle.
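Equations (8) and (9) say that $r_1^2$ can be computed from the masses and the mutual distances alone. The Python sketch below compares that expression with the distance found directly from the coordinates of the centre of mass, for four particles chosen arbitrarily by us; exact rational arithmetic is used so that the two results can be compared without rounding. The function names are our own.

```python
from fractions import Fraction

def dist2(p, q):
    """Squared distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centre_of_mass(masses, points):
    total = sum(masses)
    return tuple(Fraction(sum(m * p[i] for m, p in zip(masses, points)), total)
                 for i in range(3))

def r1_squared_from_distances(masses, points):
    """(Sm)^2 r_1^2 = SS' m_l m_l' (r_1l^2 + r_1l'^2 - r_ll'^2) / 2, from (8) and (9)."""
    p1 = points[0]
    double_sum = Fraction(0)
    for m_l, x_l in zip(masses, points):
        for m_k, x_k in zip(masses, points):
            bracket = dist2(p1, x_l) + dist2(p1, x_k) - dist2(x_l, x_k)
            double_sum += Fraction(m_l * m_k * bracket, 2)
    return double_sum / sum(masses) ** 2

masses = [1, 2, 3, 4]
points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (1, 1, 3)]

direct = dist2(points[0], centre_of_mass(masses, points))
from_distances = r1_squared_from_distances(masses, points)
print(direct, from_distances)
```

Since the second computation never uses the coordinates except through distances between particles, it is unchanged by any rigid-body displacement of the set of points.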

The forces on the particles can be imagined to be separated into external forces and
internal reactions. According to a principle due to d'Alembert, the internal reactions form
a system in equilibrium among themselves, and their contributions to the right sides of (4)
and (6) are zero. This principle is perhaps most completely understood if we regard the
rigid body as the limiting case of an elastic one; but it follows at once if the body is
regarded as made up of particles such that the force between any pair is along the line
joining them (i.e. the acceleration components are in the ratios of the direction cosines
of the line). For by Newton's third law the forces add up to zero, and if $\mathbf{X}'$ is the force on
$m_l$ due to $m_{l'}$, $-\mathbf{X}'$ the reaction,

$$\mathbf{x}_l\wedge\mathbf{X}' + \mathbf{x}_{l'}\wedge(-\mathbf{X}') = (\mathbf{x}_l - \mathbf{x}_{l'})\wedge\mathbf{X}' = 0,$$

if $\mathbf{X}'$ is along the line from $\mathbf{x}_l$ to $\mathbf{x}_{l'}$. This argument is not general, because the particles
might be electric or magnetic doublets, in which case the force would not be along the
line joining them. But the result can still be shown to follow in a much wider class of cases
subject to the condition that the distances between particles are unaltered. D'Alembert's
principle is therefore an approximation valid for real solids provided the deformations
can be neglected, and if they cannot it is not even strictly true that the centre of mass is
fixed in the body. The reason for accepting it, however, is ultimately that experimentally
it leads to the right answers.

In the right sides of (4) and (6) we need therefore consider only forces acting on the
body from outside it. Further, six quantities suffice to specify the position of a rigid body,
namely, the three coordinates of a given particle, and the three Euler angles specifying
its orientation. These are treated in Chapter 3. But (4) and (6) form six differential
equations, and six is the number required if we want to know how the body will move.

$S\mathbf{X}$ is the resultant force; $\mathbf{L} = S(\mathbf{x}\wedge\mathbf{X})$ is called the moment of the forces about the
origin. Apart from the equations of motion the moment would have no physical interest.
It should be noticed carefully that the moment of a force is $\mathbf{x}\wedge\mathbf{X}$, whereas the velocity
due to a rotation is $\boldsymbol{\omega}\wedge\mathbf{x}$; the signs are always obvious if reference is made to a diagram,
but mistakes in one or other of these expressions are common when vector notation is
used throughout. If $x_1$ and $X_2$ are positive, the force is clearly tending to turn the body
from $O1$ to $O2$; if $\omega_3$ and $x_1$ are positive, $\dot{x}_2$ is positive. This suffices to fix the sign of one
term in each component and the rest follow.

If the system is equivalent to a single force $\mathbf{X}$ at $\mathbf{x}$, the moment is $\mathbf{x}\wedge\mathbf{X} = \mathbf{G}$ and

$$\mathbf{G}\cdot\mathbf{X} = \epsilon_{ikm}x_kX_mX_i = 0.$$

It is therefore only in special conditions that the forces on a rigid body can be replaced
by a single force.*

* For reduction of a general system to two forces or a force and a couple, see H. Jeffreys,
Cartesian Tensors, Chapter 5; C. E. Weatherburn, Elementary Vector Analysis, Chapter 8. These
problems, however, occur only in examination questions.

EXAMPLES


1. Prove that $\delta_{ik}\epsilon_{ikm} = 0$, $\epsilon_{iks}\epsilon_{mks} = 2\delta_{im}$, $\epsilon_{iks}\epsilon_{iks} = 6$.

2. If $\Delta$ denotes the determinant $\|u_{ij}\|$, prove that

$$\epsilon_{ikm}\Delta = \epsilon_{jln}u_{ij}u_{kl}u_{mn}, \qquad \epsilon_{jln}\Delta = \epsilon_{ikm}u_{ij}u_{kl}u_{mn}, \qquad 6\Delta = \epsilon_{ikm}\epsilon_{jln}u_{ij}u_{kl}u_{mn}.$$

3. If $l_{ij}$ are the direction cosines of a transformation of axes prove that

$$l_{ij} = \tfrac{1}{2}\epsilon_{ikm}\epsilon_{jln}l_{kl}l_{mn}.$$

4. $\mathbf{z}$ is a constant unit vector, $\mathbf{r}$ (the position vector of a moving particle) a variable vector
perpendicular to it. If the velocity at any instant is given by

$$\frac{d}{dt}(\mathbf{r}e^{kt}) = \omega\mathbf{z}\wedge(\mathbf{r}e^{kt}), \tag{1}$$

where $\omega$ and $k$ are constants, show that the orbit of the particle is an equiangular spiral.

A particle is moving in a plane under the action of a force to a fixed point proportional to the radial
distance and a frictional resistance proportional to its velocity. Obtain the equation of motion of the
particle, and by seeking solutions of the type (1) or otherwise show that the velocity at any instant
is the vector sum of the velocities of the two particles describing equiangular spirals in opposite
directions with equal angular velocities. (M/c, Part II, 1931.)

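5. $A$, $B$, $C$ are any three points on a sphere, centre $O$, of unit radius. The position vectors of $A$, $B$, $C$
relative to $O$ are $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ respectively. Show that the diameter which is perpendicular to the plane
$ABC$ cuts the sphere in the points whose position vectors are $\pm\mathbf{d}$, where

$$[\mathbf{u}, \mathbf{v}, \mathbf{w}]\,\mathbf{d}\sec\theta = \mathbf{v}\wedge\mathbf{w} + \mathbf{w}\wedge\mathbf{u} + \mathbf{u}\wedge\mathbf{v}$$

and $\theta$ is the angle between $\mathbf{d}$ and $\mathbf{u}$.

By considering the product $\mathbf{u}\wedge\mathbf{d}$, or otherwise, prove that

$$[\mathbf{u}, \mathbf{v}, \mathbf{w}]\tan\theta = \pm 4\sin\frac{a}{2}\sin\frac{b}{2}\sin\frac{c}{2},$$

where $a$, $b$, $c$ are the sides of the spherical triangle $ABC$. (Prelim. 1941.)

6. If $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, $\mathbf{D}$ are any four vectors, prove that

$$(\mathbf{A}\wedge\mathbf{B})\cdot(\mathbf{C}\wedge\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C}),$$

$$(\mathbf{A}\wedge\mathbf{B})\wedge(\mathbf{C}\wedge\mathbf{D}) = [\mathbf{C}, \mathbf{D}, \mathbf{A}]\mathbf{B} - [\mathbf{B}, \mathbf{C}, \mathbf{D}]\mathbf{A} = [\mathbf{D}, \mathbf{A}, \mathbf{B}]\mathbf{C} - [\mathbf{A}, \mathbf{B}, \mathbf{C}]\mathbf{D},$$

where $[\mathbf{A}, \mathbf{B}, \mathbf{C}]$ denotes the triple scalar product $\mathbf{A}\cdot(\mathbf{B}\wedge\mathbf{C})$.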

Deduce the sine and cosine rules of spherical trigonometry. (Prelim 19£0 ) 

7. Two particles are projected simultaneously from the origin with velocities $\mathbf{v}_1$, $\mathbf{v}_2$ respectively
and move under a constant acceleration $\mathbf{a}$. Prove that if $\mathbf{v}_1\cdot\mathbf{v}_2 < 0$, there is one and only one instant
during the subsequent motion at which the particles subtend a right angle at the origin.

Show that at this instant the position vectors $\mathbf{r}_1$, $\mathbf{r}_2$ of the particles satisfy the equation

$$a^2(\mathbf{r}_1\cdot\mathbf{v}_2 - \mathbf{r}_2\cdot\mathbf{v}_1) + (\mathbf{a}\cdot\mathbf{r}_2 - \mathbf{a}\cdot\mathbf{r}_1)(\mathbf{a}\cdot\mathbf{v}_1 + \mathbf{a}\cdot\mathbf{v}_2) + 2\mathbf{v}_1\cdot\mathbf{v}_2(\mathbf{a}\cdot\mathbf{v}_2 - \mathbf{a}\cdot\mathbf{v}_1) = 0.$$

(Prelim. 1940.)

8. Find an expression for the position vector $\mathbf{r}$, at time $t$, of a particle of unit mass which moves
under the action of a constant force $(n^2 + k^2)\mathbf{b}$, together with an attractive force of magnitude
$(n^2 + k^2)r$ towards the origin (where $n \neq 0$), in a medium which produces a retardation $2k$ times the
speed. At time $t = 0$ the particle has velocity $\mathbf{v}$ and is at $\mathbf{r} = \mathbf{a}$.

Deduce that the triple scalar products of $\mathbf{r}$, $\mathbf{v}$, $\mathbf{a} - \mathbf{b}$ and of $\mathbf{a}$, $\mathbf{b}$, $\mathbf{v}$ are equal. (Prelim. 1941.)

9. A particle of charge $e$ and mass $m$ moves under the action of a uniform electric field of intensity
$(0, E, 0)$ and a uniform magnetic field of intensity $(0, 0, H)$, Gaussian units being used. Prove that the
motion can be regarded as the constant velocity $(Ec/H, 0, 0)$ superposed upon uniform motion in a
circular helix with angular velocity $-eH/mc$ about the axis. It is to be assumed that the variation of
mass with velocity is negligible.

Prove that, if the particle starts from the origin, then, whatever its initial velocity, it crosses each
of the straight lines $x = 2n\pi mc^2E/eH^2$, $y = 0$, where $n = 1, 2, 3$. (M.T. 1943.)

10. A particle of mass m at r is acted upon by a central force fir together with a force e(HAr)/c 

where H is a uniform magnetic field. Show that if r and rare initially perpendicular to H the particle 
will describe a plane curve. * 

Show that the particle can describe a circle about the origin under these forces, with either of two 
constant angular velocities. ‘ q 1942 ) 

11. _Determine a vector OC perpendicular to OA = o(2,3,0), OB = b{ — 2,0, l)suchthat the rotation 
from OA to OB is positive about OC. Calculate the volume of the tetrahedron OABC. 

Find the sides and angles of the spherical triangle ABC defined by 


OA = 


o*=(io,±), 00 . ( 0.-L--I). 








Chapter 3 

TENSORS 


We know that intellectual food is sometimes more easily digested, if not taken in the most
condensed form. It will be asked, To what extent can specialized notations be adopted with
profit? To this question we reply, only experience can tell.

F. Cajori, History of Mathematical Notations, p. 77


3.01. In this chapter we develop the theory of tensors in a simple and restricted
form. In many branches of physics the tensor notation in this form provides a 
compact mathematical expression, and familiarity with it is a preparation for the 
complete theory, involving the use of oblique axes, curvilinear coordinates and space 
of more than three dimensions; it is also an introduction to the ideas of matrix algebra. 
General tensor theory is indispensable as the mathematical apparatus of the theory of 
relativity, and matrix algebra in quantum mechanics and much of classical physics find 
their clearest expression in this notation. In the applications made in this chapter the 
physical ideas involved are simple, and practice in using the notation in this way is 
extremely valuable before proceeding to the applications of its complete form to theories 
where the physical ideas themselves are more difficult to grasp. 

3.02. Transformation of coordinates. Contraction. We have defined a vector $\mathbf{A}$
by the transformation property

$$A'_j = l_{ij}A_i, \tag{1}$$

which is equivalent to

$$A_i = l_{ij}A'_j. \tag{2}$$

A vector can also be called a tensor of the first order. A scalar is a tensor of zero order.

Now if we consider the set of nine products $A_iB_k$ we notice that the scalar and vector
products are particular linear combinations of these products. If we form a similar set
for the components referred to new axes

$$A'_jB'_l = l_{ij}l_{kl}A_iB_k, \tag{3}$$

$$A_iB_k = l_{ij}l_{kl}A'_jB'_l. \tag{4}$$

In the same way as we use the transformation property to define a vector we now use
these relations to define a tensor of the second order. A set of quantities depending on
two directions and specified by nine components $K_{ik}$ referred to $O123$ and $K'_{jl}$ referred
to $O1'2'3'$ forms a tensor of the second order if for all changes of axes

$$K'_{jl} = l_{ij}l_{kl}K_{ik}, \tag{5}$$

or the equivalent relation

$$K_{ik} = l_{ij}l_{kl}K'_{jl}. \tag{6}$$

The two suffixes denoting the component of $K$ refer each to one of the coordinates of the
same system. The direction cosines do not form a tensor since the two suffixes refer to
axes of different sets.

In the square array

$$\begin{pmatrix} K_{11} & K_{12} & K_{13} \\ K_{21} & K_{22} & K_{23} \\ K_{31} & K_{32} & K_{33} \end{pmatrix} \tag{7}$$

the components $K_{11}$, $K_{22}$, $K_{33}$ are called the diagonal components. Their sum $K_{ii}$ is called
the trace or spur and is a scalar. For

$$K'_{jj} = l_{ij}l_{kj}K_{ik} = \delta_{ik}K_{ik} = K_{ii}. \tag{8}$$

The operation of putting two suffixes equal in a tensor and then summing is called
contraction. The order of the tensor is reduced by 2.

The sum of two tensors $K$ and $L$ is defined by

$$(K + L)_{ik} = K_{ik} + L_{ik},$$

and is clearly also a tensor.

Tensors of higher orders are defined in a similar way; that is, a tensor of the $n$th order
transforms like the product $A_iB_kC_m\ldots$ to $n$ factors. We shall be mainly concerned with
second-order tensors with some use of third and fourth order ones. 
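The transformation rules 3.02 (1), (5) and (8) can be tried on a simple case. In the Python sketch below the direction cosines are those of a rotation of the axes about axis 3, an assumption made only to have definite numbers; the tensor is built as the set of products $A_iB_k$, and the function names are our own. The sums over repeated suffixes are written out explicitly.

```python
import math

def rotation_about_axis3(angle):
    """Direction cosines l[i][j]; a rotation about axis 3 is assumed for illustration."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_vector(l, A):
    # A'_j = l_ij A_i
    return [sum(l[i][j] * A[i] for i in range(3)) for j in range(3)]

def transform_tensor(l, K):
    # K'_jl = l_ij l_kl K_ik
    return [[sum(l[i][j] * l[k][n] * K[i][k] for i in range(3) for k in range(3))
             for n in range(3)] for j in range(3)]

def trace(K):
    # Contraction K_ii
    return sum(K[i][i] for i in range(3))

l = rotation_about_axis3(0.6)
A, B = [1.0, 2.0, 3.0], [-1.0, 0.5, 4.0]
K = [[A[i] * B[k] for k in range(3)] for i in range(3)]

K_new = transform_tensor(l, K)
A_new, B_new = transform_vector(l, A), transform_vector(l, B)
print(trace(K), trace(K_new))
```

For this choice the contracted tensor is the scalar product $A_iB_i$, so its invariance is the familiar invariance of $\mathbf{A}\cdot\mathbf{B}$.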

3*03. Isotropic tensors. We can show that the set of quantities 8 ik constitute a 
tensor. For if we apply the transformation (5) we get a set of quantities Z7' 7 given by 

U'n = hjlki^ik — hjhi = 1 0 = 0>] 

= o (9) 

and therefore the set 8 ik transforms into 8 j{ on any rotation of axes. 

Similarly, we can show that a third-order tensor with components $\epsilon_{ikm}$ referred to 
O123 has the same set of components referred to O1'2'3'. For on transformation we get 
for the $jln$ component $l_{ij} l_{kl} l_{mn} \epsilon_{ikm}$. Expanding, we have, since $i$, $k$, $m$ are all different in 
non-zero terms, 

$l_{1j} l_{2l} l_{3n} + l_{2j} l_{3l} l_{1n} + l_{3j} l_{1l} l_{2n} - l_{2j} l_{1l} l_{3n} - l_{3j} l_{2l} l_{1n} - l_{1j} l_{3l} l_{2n}$. 

If $j = l$ all components cancel; similarly, if $j = n$ or $l = n$. If $j$, $l$, $n$ are all different the 
expression is 

$\begin{vmatrix} l_{1j} & l_{1l} & l_{1n} \\ l_{2j} & l_{2l} & l_{2n} \\ l_{3j} & l_{3l} & l_{3n} \end{vmatrix}$, 

which is equal to 1 if $j$, $l$, $n$ are in even order and $-1$ if they are in odd order. Hence $\epsilon_{ikm}$ 
transforms into $\epsilon_{jln}$ under the rule for tensors of the third order. 
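Both isotropy statements of this section can be verified for a particular rotation. This is only a numerical sketch under stated assumptions: the function names and the rotation angles are illustrative, indices run from 0 to 2, and the free suffixes called $j$, $l$, $n$ in the text are named `j`, `n`, `q` in the code to avoid clashing with the array `l` of direction cosines.

```python
import math

def epsilon(i, k, m):
    """Alternating tensor on indices 0, 1, 2: +1 for even order,
    -1 for odd order, 0 if any two indices are equal."""
    return (i - k) * (k - m) * (m - i) // 2

def direction_cosines(a, b):
    """l[i][j] = cosine between old axis i and new axis j (a proper rotation)."""
    ca, sa, cb, sb = math.cos(a), math.sin(a), math.cos(b), math.sin(b)
    r3 = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    r1 = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    return [[sum(r3[i][k] * r1[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

l = direction_cosines(0.3, 1.2)
R = range(3)

# l_ij l_il compared with delta_jl, as in equation (9)
delta_dev = max(abs(sum(l[i][j] * l[i][n] for i in R) - (1.0 if j == n else 0.0))
                for j in R for n in R)

# l_ij l_kl l_mn eps_ikm compared with eps_jln
eps_dev = max(abs(sum(l[i][j] * l[k][n] * l[m][q] * epsilon(i, k, m)
                      for i in R for k in R for m in R) - epsilon(j, n, q))
              for j in R for n in R for q in R)
print(delta_dev, eps_dev)  # both are at rounding-error level
```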

Tensors whose components are unaltered by rotation of the axes are called isotropic. 
It can be shown* that there is no isotropic tensor of the first order, and the only ones of 
the second and third orders are scalar multiples of $\delta_{ik}$ and $\epsilon_{ikm}$. There are three independent 
ones of the fourth order, namely, 

$\delta_{ik}\delta_{mp}, \quad \delta_{im}\delta_{kp} + \delta_{ip}\delta_{km}, \quad \delta_{im}\delta_{kp} - \delta_{ip}\delta_{km}$. (10) 

We have met the last as an alternative expression for $\epsilon_{iks}\epsilon_{mps}$. The other two appear in 
the derivation of the equations of motion of viscous fluids and elastic solids. 


* H. Jeffreys, Cartesian Tensors, Chapter 7. See also note 3·03a. 
3·031. Isotropic tensors of order 4. If $u_{ikmp}$ is an isotropic tensor of order 4 we have 

$u'_{jlnq} = l_{ij} l_{kl} l_{mn} l_{pq} u_{ikmp} = u_{jlnq}$ (1) 

for all rotations. Having regard to the fact that at least two suffixes must be equal in any component 
we see that the components fall into four patterns typified by $u_{1111}$, $u_{1112}$, $u_{1122}$, $u_{1123}$. 

First rotate the axes about a line with direction cosines $1/\sqrt{3}$, $1/\sqrt{3}$, $1/\sqrt{3}$ so as to bring axis 1 into 
coincidence with the original axis 2, and so on. Since the result is a cyclic interchange of suffixes it 
follows from the isotropic property that 

$u_{1111} = u_{2222} = u_{3333}$, $u_{1122} = u_{2233} = u_{3311}$, $u_{2211} = u_{3322} = u_{1133}$, $u_{1221} = u_{2332} = u_{3113}$, etc. (2) 

Next rotate through 90° about O3. Then 

$l_{12} = 1$, $l_{21} = -1$, $l_{33} = 1$ (3) 

and the rest are zero. Take $j = 3$, $l = n = q = 1$. Then the non-zero terms are for $i = 3$, $k = m = p = 2$, 
and 

$u_{3111} = -u_{3222}$. (4) 

Take also $j = 3$, $l = n = q = 2$. Then we must take $i = 3$, $k = m = p = 1$, and 

$u_{3222} = u_{3111}$. (5) 

Together (4) and (5) give $u_{3111} = u_{3222} = 0$. By similar methods it follows that all components 
with three suffixes equal and the other different are zero. 

Similarly, we find, with 

$j = l = 1$, $n = q = 2$; $i = k = 2$, $m = p = 1$: $u_{1122} = u_{2211}$, 
$j = n = 2$, $l = q = 3$; $i = m = 1$, $k = p = 3$: $u_{2323} = u_{1313}$, 
$j = q = 1$, $l = n = 2$; $i = p = 2$, $k = m = 1$: $u_{1221} = u_{2112}$, (6) 

$j = 3$, $l = n = 2$, $q = 1$; $i = 3$, $k = m = 1$, $p = 2$: $u_{3221} = -u_{3112}$, 
$j = 3$, $l = n = 1$, $q = 2$; $i = 3$, $k = m = 2$, $p = 1$: $u_{3112} = u_{3221}$. (7) 

Hence the only non-zero components are those with the suffixes all equal or equal in pairs; and by 
cyclic interchange 

$u_{1111} = u_{2222} = u_{3333} = \kappa$, (8) 

$u_{1122} = u_{2211} = u_{2233} = u_{3322} = u_{3311} = u_{1133} = \lambda$, (9) 

$u_{2323} = u_{1313} = u_{3131} = u_{2121} = u_{1212} = u_{3232} = \mu$, (10) 

$u_{1221} = u_{2112} = u_{2332} = u_{3223} = u_{3113} = u_{1331} = \nu$. (11) 

These relations would all be satisfied for cubic symmetry. We can now write 

$u_{ikmp} = \lambda\,\delta_{ik}\delta_{mp} + \mu\,\delta_{im}\delta_{kp} + \nu\,\delta_{ip}\delta_{km} + (\kappa - \lambda - \mu - \nu)\,v_{ikmp}$, (12) 

where $v_{ikmp} = 1$ if all four suffixes are equal and otherwise zero. Now if $u_{ikmp}$ is a tensor of order 4, 
$u_{ikmp} x_i y_k z_m w_p$ is a scalar, and conversely (see 3·05). This expression reduces to 

$\lambda x_i y_i z_m w_m + \mu x_i z_i y_k w_k + \nu x_i w_i y_k z_k + (\kappa - \lambda - \mu - \nu)(x_1 y_1 z_1 w_1 + \ldots)$. (13) 

The first three expressions are all products of scalars. But the last, if we take all the vectors the same, 
is $x_1^4 + x_2^4 + x_3^4$, which is not a scalar. It has cubic symmetry but not spherical symmetry. E.g. if 
$x_1 = x_2 = x_3 = 1/\sqrt{3}$, $x_1^4 + x_2^4 + x_3^4 = 1/3$; but if $x'_1 = 1$, $x'_2 = x'_3 = 0$, $x_1'^4 + x_2'^4 + x_3'^4 = 1$; and this change of 
components would be achieved by a rotation of axes. Hence if the tensor is isotropic 

$\kappa - \lambda - \mu - \nu = 0$, (14) 

and the most general isotropic tensor of order 4 is given by the first three terms of (12). It can be 
rewritten as the sum of three tensors of the forms 3·03 (10). 

For any solid the elastic constants form a fourth-order tensor, which must be isotropic if the solid is. 
The $v_{ikmp}$ term expresses an extra generality and permits the expression of the elastic properties of 
a cubic crystal: Young's modulus can have different values for strains along a diagonal and parallel 
to an edge. 
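The conclusion of this section can be illustrated numerically: the first three terms of (12) are unchanged by a rotation of axes, while the term $v_{ikmp}$ is not. The sketch below uses plain Python; the values of $\lambda$, $\mu$, $\nu$, the rotation angles and all function names are illustrative assumptions, and indices run from 0 to 2.

```python
import math
from itertools import product

def direction_cosines(a, b):
    """l[i][j] = cosine between old axis i and new axis j (a proper rotation)."""
    ca, sa, cb, sb = math.cos(a), math.sin(a), math.cos(b), math.sin(b)
    r3 = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    r1 = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    return [[sum(r3[i][k] * r1[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def delta(i, k):
    return 1.0 if i == k else 0.0

def max_change(u, l):
    """Largest |u'_jlnq - u_jlnq| over all components, where
    u'_jlnq = l_ij l_kl l_mn l_pq u_ikmp and u is a function of four indices."""
    idx = list(product(range(3), repeat=4))
    worst = 0.0
    for (j, n1, n2, q) in idx:
        new = sum(l[i][j] * l[k][n1] * l[m][n2] * l[p][q] * u(i, k, m, p)
                  for (i, k, m, p) in idx)
        worst = max(worst, abs(new - u(j, n1, n2, q)))
    return worst

lam, mu, nu = 2.0, 3.0, 5.0  # arbitrary illustrative constants

def isotropic(i, k, m, p):
    """First three terms of (12)."""
    return (lam * delta(i, k) * delta(m, p) + mu * delta(i, m) * delta(k, p)
            + nu * delta(i, p) * delta(k, m))

def cubic(i, k, m, p):
    """v_ikmp: 1 if all four suffixes are equal, otherwise 0."""
    return 1.0 if i == k == m == p else 0.0

l = direction_cosines(0.7, -1.1)
iso_change = max_change(isotropic, l)    # rounding error only
cubic_change = max_change(cubic, l)      # clearly non-zero
print(iso_change, cubic_change)
```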

3·04. Dyadic notation. It is sometimes convenient to denote a tensor of the second order by 
a single letter, as we do for a vector. If we multiply all components of a tensor $K_{ik}$ by those of a vector 
$A_m$ we get a tensor of the third order. But we can form from this product two different vectors by 
putting $m$ equal to $i$ or $k$ and summing, namely, $K_{ik} A_i$ and $K_{ik} A_k$. In dyadic notation these are written 
$\mathbf{A} \cdot \mathbf{K}$ and $\mathbf{K} \cdot \mathbf{A}$ respectively, the rule for remembering which is which being that the order taken in the 
product is such that summation is always taken over adjacent suffixes; thus 

$(\mathbf{A} \cdot \mathbf{K})_k = A_i K_{ik}, \quad (\mathbf{K} \cdot \mathbf{A})_i = K_{ik} A_k$. (1) 

The proof that contracting a tensor of order $n$ gives one of order $n - 2$ is similar to that of 3·02 (8) and 
need not be given in full. The use of heavy type can be taken as an indication that one or more suffixes 
are suppressed. 

Similarly, we can form contracted products of two tensors $\mathbf{K}$ and $\mathbf{L}$ of the second order, namely, 

$(\mathbf{K} \cdot \mathbf{L})_{ik} = K_{im} L_{mk}, \quad (\mathbf{L} \cdot \mathbf{K})_{ik} = L_{im} K_{mk}$. (2) 

Again in general the result depends on the order; this type of multiplication is not commutative. 
In this notation the tensor $A_i B_k$ is written as $\mathbf{AB}$; the absence of the dot distinguishes the tensor from 
both the scalar and the vector products. Dyadic notation has analogies with matrix notation, which 
will be developed in the next chapter. The compression introduced by the suppression of the suffixes 
is compensated by the extra care that has to be taken to preserve the order, and by the fact that we 
sometimes do not want to contract. In particular the elastic constants of a crystal form a fourth-order 
tensor. 
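The rule that summation is taken over adjacent suffixes can be made concrete with a few lines of Python. This is an illustrative sketch only; the function names and the sample components are assumptions, and indices run from 0 to 2.

```python
def dot_vec_tensor(A, K):
    """(A.K)_k = A_i K_ik: summation over the first suffix of K."""
    return [sum(A[i] * K[i][k] for i in range(3)) for k in range(3)]

def dot_tensor_vec(K, A):
    """(K.A)_i = K_ik A_k: summation over the second suffix of K."""
    return [sum(K[i][k] * A[k] for k in range(3)) for i in range(3)]

def dot_tensor_tensor(K, L):
    """(K.L)_ik = K_im L_mk."""
    return [[sum(K[i][m] * L[m][k] for m in range(3)) for k in range(3)]
            for i in range(3)]

def dyad(A, B):
    """The tensor AB with components A_i B_k (no contraction)."""
    return [[A[i] * B[k] for k in range(3)] for i in range(3)]

K = [[1, 2, 0], [0, 1, 0], [0, 0, 1]]
L = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
A = [1, 0, 0]
B = [0, 2, 1]

AK = dot_vec_tensor(A, K)      # picks out the first row of K
KA = dot_tensor_vec(K, A)      # picks out the first column of K
KL = dot_tensor_tensor(K, L)
LK = dot_tensor_tensor(L, K)
AB = dyad(A, B)
print(AK, KA, KL != LK)
```

With these sample values `AK` and `KA` differ, and so do `KL` and `LK`, which is the non-commutativity referred to above.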

3·05. The quotient rule. If we have a set of equations 

$K_{ik} A_k = B_i$, (1) 

where $A_k$ and $B_i$ are known to be first-order tensors, or if 

$K_{ik} T_{km} = S_{im}$, (2) 

where $T_{km}$ and $S_{im}$ are second-order tensors, can we conversely infer that $K_{ik}$ is a second-order 
tensor? The answer is that we can, provided that all the components $A_k$ or $T_{km}$ can 
be varied independently. We take the simplest case, starting from (1). We transform the 
axes; then we do not know how $K_{ik}$ transforms but it must give a set $K'_{jl}$ with nine components 
specified by $\partial B'_j/\partial A'_l$. Then 

$K'_{jl} A'_l = B'_j = l_{ij} B_i = l_{ij} K_{ik} A_k = l_{ij} l_{kl} K_{ik} A'_l$. (3) 

Hence 

$(K'_{jl} - l_{ij} l_{kl} K_{ik}) A'_l = 0$. (4) 

But if this is true for every $j$, when each $A'_l$ is varied separately, 

$K'_{jl} = l_{ij} l_{kl} K_{ik}$, (5) 

and $K_{ik}$ is a second-order tensor. 

An important particular case of the general theorem is that if $K_{ikm\ldots} T_{ikm\ldots}$ is a scalar, 
then if $T$ is an arbitrary $n$th-order tensor $K$ is also an $n$th-order tensor. In particular, the 
coefficients $a_{ik}$ in a quadratic form $a_{ik} x_i x_k$, where $a_{ki} = a_{ik}$, in the coordinates of a point 
form a tensor of the second order. 

3·06. Differentiation of scalar and vector functions of position. A scalar, or the 
components of a vector or a tensor, may have different values at different places, even if 
we consider them only at a single instant of time and do not transform the axes. Thus the 
different particles of a body differ in distance from the origin, but the distance of a given 
particle does not depend on the directions of the axes. Such a function is called a scalar 
function of position. Again, the velocity of a particle of a fluid is a vector, but in general 
varies with position, and can be called a vector function of position. The existence of such 
functions makes it necessary to consider their differentiation. 

If $\phi$ is a scalar function, the set of three derivatives $\partial\phi/\partial x_i$ specifies a vector denoted by 
grad $\phi$ or $\nabla\phi$. To show that it is a vector we rotate the axes; we have 

$\frac{\partial\phi}{\partial x'_j} = \frac{\partial x_i}{\partial x'_j}\frac{\partial\phi}{\partial x_i} = l_{ij}\frac{\partial\phi}{\partial x_i}$, (1) 

which proves the result. This assumes that $\phi$ is differentiable in three dimensions in the 
sense of Stolz and Young: a sufficient condition for this is that the partial derivatives of $\phi$ 
are continuous with regard to all the coordinates. Cf. 5·04. 

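If $u_i$ is a vector function, 

$\frac{\partial u'_l}{\partial x'_j} = \frac{\partial x_i}{\partial x'_j}\frac{\partial u'_l}{\partial x_i} = l_{ij}\frac{\partial}{\partial x_i}(l_{kl} u_k) = l_{ij} l_{kl}\frac{\partial u_k}{\partial x_i}$, (2) 

and therefore $\partial u_k/\partial x_i$ is a tensor of the second order. 

It follows that $\partial u_i/\partial x_i$ is a scalar function; it is usually denoted by div $\mathbf{u}$ or $\nabla \cdot \mathbf{u}$. If, 
further, there is a function $\phi$ such that $u_i = \partial\phi/\partial x_i$, 

$\frac{\partial u_i}{\partial x_i} = \frac{\partial^2\phi}{\partial x_i^2} = \frac{\partial^2\phi}{\partial x_1^2} + \frac{\partial^2\phi}{\partial x_2^2} + \frac{\partial^2\phi}{\partial x_3^2}$. (3) 

This combination of second derivatives has an importance in mathematical physics 
second only to differentiation with regard to the time. It is denoted by $\nabla^2\phi$. 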

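The quantities 

$\epsilon_{ikm}\frac{\partial u_m}{\partial x_k} = \left(\frac{\partial u_3}{\partial x_2} - \frac{\partial u_2}{\partial x_3},\ \frac{\partial u_1}{\partial x_3} - \frac{\partial u_3}{\partial x_1},\ \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2}\right)$, 

or 

$\left(\frac{\partial w}{\partial y} - \frac{\partial v}{\partial z},\ \frac{\partial u}{\partial z} - \frac{\partial w}{\partial x},\ \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right)$ (4) 

in Cartesian notation, determine a vector. It is denoted by curl $\mathbf{u}$. ($\nabla \wedge \mathbf{u}$ is also used.) It 
will be noticed that if $\mathbf{u}$ is the gradient of a scalar, curl $\mathbf{u} = 0$: 

$(\mathrm{curl\ grad}\ \phi)_i = \epsilon_{ikm}\frac{\partial^2\phi}{\partial x_k\,\partial x_m} = 0$. (5) 

Also if $\mathbf{u}$ is any vector function 

$\mathrm{div\ curl}\ \mathbf{u} = \frac{\partial}{\partial x_i}\left(\epsilon_{ikm}\frac{\partial u_m}{\partial x_k}\right) = \epsilon_{ikm}\frac{\partial^2 u_m}{\partial x_i\,\partial x_k} = 0$, (6) 

since all the terms cancel. 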
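The suffix definitions of grad, div and curl translate directly into code. The sketch below replaces the derivatives by central differences, so it is a numerical illustration of (5) and (6) rather than a proof; the step length, the sample fields `phi` and `u`, the evaluation point and all function names are assumptions made for the example, and indices run from 0 to 2.

```python
import math

def epsilon(i, k, m):
    """Alternating tensor on indices 0, 1, 2."""
    return (i - k) * (k - m) * (m - i) // 2

def partial(f, x, k, h=1e-3):
    """Central-difference estimate of df/dx_k at the point x."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return (f(xp) - f(xm)) / (2.0 * h)

def grad(phi):
    """Function returning (grad phi)_i = d(phi)/dx_i."""
    return lambda x: [partial(phi, x, i) for i in range(3)]

def div(u):
    """Function returning div u = du_i/dx_i."""
    return lambda x: sum(partial(lambda y, i=i: u(y)[i], x, i) for i in range(3))

def curl(u):
    """Function returning (curl u)_i = eps_ikm du_m/dx_k."""
    return lambda x: [sum(epsilon(i, k, m) * partial(lambda y, m=m: u(y)[m], x, k)
                          for k in range(3) for m in range(3) if epsilon(i, k, m))
                      for i in range(3)]

def phi(x):
    return x[0] ** 2 * x[1] + math.sin(x[2]) * x[0]

def u(x):
    return [x[1] * x[2] ** 2, math.cos(x[0]) * x[2], x[0] * x[1] * x[2]]

point = [0.3, -0.5, 0.8]
curl_u = curl(u)(point)
curl_grad = curl(grad(phi))(point)   # each component should be close to zero
div_curl = div(curl(u))(point)       # should be close to zero
print(curl_u, curl_grad, div_curl)
```

The cancellation in (5) and (6) comes from contracting the antisymmetrical $\epsilon_{ikm}$ with a pair of suffixes in which the second derivative is symmetrical; the central-difference operators used here commute in the same way, which is why the residuals are at rounding-error level.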
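A useful result is obtained by taking the curl again. We have 

$(\mathrm{curl\ curl}\ \mathbf{u})_i = \epsilon_{ikm}\frac{\partial}{\partial x_k}\left(\epsilon_{mps}\frac{\partial u_s}{\partial x_p}\right) = (\delta_{ip}\delta_{ks} - \delta_{is}\delta_{kp})\frac{\partial^2 u_s}{\partial x_k\,\partial x_p}$ 

$= \frac{\partial^2 u_k}{\partial x_i\,\partial x_k} - \frac{\partial^2 u_i}{\partial x_k^2} = (\mathrm{grad\ div}\ \mathbf{u} - \nabla^2\mathbf{u})_i$. (7) 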

3·07. Symmetry properties. If $K_{ik}$ is a tensor, we show that $K_{ki}$, obtained by interchanging 
rows and columns, is another. This means that if $K_{ik}$ transforms into $K'_{jl}$ 
according to the rule 

$K'_{jl} = l_{ij} l_{kl} K_{ik}$ (1) 

and we write $L_{ik} = K_{ki}$, $L_{ik}$ will transform according to the same rule. But 

$l_{ij} l_{kl} L_{ik} = l_{ij} l_{kl} K_{ki}$. (2) 

Here $i$ and $k$ are repeated suffixes; it is therefore immaterial which we call $i$ and which $k$, 
and therefore they can be interchanged, giving 

$l_{ij} l_{kl} K_{ki} = l_{kj} l_{il} K_{ik} = K'_{lj}$, (3) 

so that the transformed set also differs from $K'_{jl}$ in having rows and columns interchanged. 

If $K_{ik} = K_{ki}$, the tensor is said to be symmetrical; if $K_{ik} = -K_{ki}$, it is said to be antisymmetrical. 
Again, if $K_{ik}$ is a tensor, two others are $K_{ik} + K_{ki}$ and $K_{ik} - K_{ki}$. The first of 
these is unaltered if $i$ and $k$ are interchanged and is a symmetrical tensor; the second has 
the signs of all components reversed and is an antisymmetrical tensor. Since any tensor 
$K_{ik}$ can be written as 

$K_{ik} = \tfrac{1}{2}(K_{ik} + K_{ki}) + \tfrac{1}{2}(K_{ik} - K_{ki})$, (4) 

it can be expressed as the sum of a symmetrical and an antisymmetrical tensor. 

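Since $K_{ik} A_k$ and $A_i K_{ik}$ are vectors we can form their scalar products $B_i K_{ik} A_k$ and 
$A_i K_{ik} B_k$ with another vector $\mathbf{B}$. These products are not in general equal. But if $K_{ik}$ is 
symmetrical they are equal, for 

$\mathbf{B} \cdot \mathbf{K} \cdot \mathbf{A} = B_i K_{ik} A_k = B_i K_{ki} A_k = A_k K_{ki} B_i = \mathbf{A} \cdot \mathbf{K} \cdot \mathbf{B}$. (5) 

On the other hand, if $K_{ik}$ is antisymmetrical the sign is reversed in the third of these 
expressions, and 

$\mathbf{B} \cdot \mathbf{K} \cdot \mathbf{A} = -\mathbf{A} \cdot \mathbf{K} \cdot \mathbf{B}$. (6) 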
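The decomposition (4) and the relations (5) and (6) can be sketched in a few lines. The function names `split` and `bilinear` and the sample numbers are illustrative assumptions; indices run from 0 to 2.

```python
def split(K):
    """Return the symmetrical and antisymmetrical parts of K, as in (4)."""
    R = range(3)
    sym = [[0.5 * (K[i][k] + K[k][i]) for k in R] for i in R]
    anti = [[0.5 * (K[i][k] - K[k][i]) for k in R] for i in R]
    return sym, anti

def bilinear(B, K, A):
    """B.K.A = B_i K_ik A_k."""
    return sum(B[i] * K[i][k] * A[k] for i in range(3) for k in range(3))

K = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
A = [1.0, -2.0, 0.5]
B = [3.0, 0.0, 4.0]
sym, anti = split(K)

# For the symmetrical part the two products agree; for the
# antisymmetrical part they differ only in sign.
print(bilinear(B, sym, A), bilinear(A, sym, B))
print(bilinear(B, anti, A), bilinear(A, anti, B))
```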


It is important to notice that if $K_{ik}$ is symmetrical and $u_i$ a vector, every term in 
$K_{ik} u_i u_k$ with $i \neq k$ occurs twice. Thus for $i = 1$, $k = 2$ we get a term $K_{12} u_1 u_2$, but there is 
another term with $i = 2$, $k = 1$ equal to $K_{21} u_2 u_1$. If $K_{12} = K_{21}$ these terms are equal. 
Thus the expansion of 

$T = K_{ik} u_i u_k$ 

is 

$K_{11} u_1^2 + 2K_{12} u_1 u_2 + K_{22} u_2^2 + 2K_{13} u_1 u_3 + \ldots$, 

and 

$\frac{\partial T}{\partial u_1} = 2(K_{11} u_1 + K_{12} u_2 + K_{13} u_3) = 2K_{1k} u_k$, 

and in general 

$\frac{\partial T}{\partial u_i} = 2K_{ik} u_k$. 





3·071. The vector of a tensor, vec K. Consider the triad $\epsilon_{ikm} K_{km}$. This is the twice 
contracted product of tensors of the third and second orders and therefore is a vector; 
alternatively, by changing axes we have 

$\epsilon_{jln} K'_{ln} = \epsilon_{jln} l_{kl} l_{mn} K_{km} = \epsilon_{ikm} l_{ij} K_{km}$. (7) 

We can therefore write this as 2 vec $\mathbf{K}$, where 

$(\mathrm{vec}\ \mathbf{K})_i = \tfrac{1}{2}\epsilon_{ikm} K_{km}$; in components, $\{\tfrac{1}{2}(K_{23} - K_{32}),\ \tfrac{1}{2}(K_{31} - K_{13}),\ \tfrac{1}{2}(K_{12} - K_{21})\}$. (8) 

3·072. Relations between an antisymmetrical tensor and a vector. The components 
of an antisymmetrical tensor $W_{ik}$ with $i = k$ must vanish, and since for the others 
$W_{ik} = -W_{ki}$ only three independent quantities need be given to specify an antisymmetrical 
tensor, which then takes the form 

$\begin{pmatrix} 0 & W_{12} & -W_{31} \\ -W_{12} & 0 & W_{23} \\ W_{31} & -W_{23} & 0 \end{pmatrix}$. (9) 

But $W_{23}$, $W_{31}$, $W_{12}$ are the components of vec $\mathbf{W}$. We shall denote this vector by $\mathbf{w}$, that is, 

$w_i = \tfrac{1}{2}\epsilon_{ikm} W_{km}$. (10) 

This property, that the number of components of a vector is equal to that of the independent 
components of an antisymmetrical tensor, is peculiar to three dimensions. 
In $n$ dimensions an antisymmetrical tensor has $\tfrac{1}{2}n(n - 1)$ independent components, 
while a vector has $n$ components. 

It follows that the set of quantities in (9) is the same as the set 

$\begin{pmatrix} 0 & w_3 & -w_2 \\ -w_3 & 0 & w_1 \\ w_2 & -w_1 & 0 \end{pmatrix}$, 

that is, $W_{12} = w_3$, $W_{21} = -w_3$, and in general 

$W_{ik} = 0$ ($i = k$), 
$W_{ik} = w_m$ ($ikm$ in even order), 
$W_{ik} = -w_m$ ($ikm$ in odd order), (11) 

and therefore 

$W_{ik} = \epsilon_{ikm} w_m$. (12) 

It is sometimes convenient to use the vector and sometimes the antisymmetrical tensor 
representation; equations (10) and (12) give the relations between them. 

In particular, if we take a vector product $(\mathbf{w} \wedge \mathbf{A})$, 

$(\mathbf{w} \wedge \mathbf{A})_i = \epsilon_{ikm} w_k A_m = \tfrac{1}{2}\epsilon_{ikm}\epsilon_{kps} W_{ps} A_m$ 
$= \tfrac{1}{2}\epsilon_{mik}\epsilon_{psk} W_{ps} A_m$ 
$= \tfrac{1}{2}(\delta_{mp}\delta_{is} - \delta_{ms}\delta_{ip}) W_{ps} A_m$ 
$= \tfrac{1}{2}(W_{mi} - W_{im}) A_m$ 
$= -W_{im} A_m$. (13) 

Hence we can replace vector multiplication by the vector $\mathbf{w}$ by multiplication by the 
antisymmetrical tensor $-W_{ik}$. Alternatively, we can derive the result by writing out a 
special component, say $i = 1$. 
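Relations (10), (12) and (13) can be checked for a sample vector. The sketch below is illustrative only; the function names and the numbers chosen for $\mathbf{w}$ and $\mathbf{A}$ are assumptions, and indices run from 0 to 2.

```python
def epsilon(i, k, m):
    """Alternating tensor on indices 0, 1, 2."""
    return (i - k) * (k - m) * (m - i) // 2

def tensor_of(w):
    """Equation (12): W_ik = eps_ikm w_m."""
    return [[sum(epsilon(i, k, m) * w[m] for m in range(3)) for k in range(3)]
            for i in range(3)]

def vec_of(W):
    """Equation (10): w_i = (1/2) eps_ikm W_km."""
    return [0.5 * sum(epsilon(i, k, m) * W[k][m]
                      for k in range(3) for m in range(3)) for i in range(3)]

def cross(a, b):
    """Vector product a ^ b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

w = [1.0, 2.0, 3.0]
A = [4.0, -1.0, 0.5]
W = tensor_of(w)
lhs = cross(w, A)                                                    # (w ^ A)_i
rhs = [-sum(W[i][m] * A[m] for m in range(3)) for i in range(3)]     # -W_im A_m
print(W, lhs, rhs)
```

Here `vec_of(tensor_of(w))` returns `w` again, so the two representations carry the same information.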

In physical applications it is sometimes the one and sometimes the other that appears 
first most naturally. In deriving the equations of angular momentum, for instance, we 
start from $m\ddot{x}_i = X_i$; multiply the $k$ equation by $x_m$, the $m$ equation by $x_k$, and subtract. 
Then we have the nine equations 

$m(\ddot{x}_k x_m - \ddot{x}_m x_k) = X_k x_m - X_m x_k$, 

in which both sides are antisymmetrical tensors. The reason for converting these equations 
into the vector form is that this eliminates the three that have the form 0 = 0 and three 
others that can be inferred from those retained by a change of sign. 

3·08. Symmetrical tensor: principal axes. We have seen that an antisymmetrical 
tensor can be related to a vector. A symmetrical tensor can be related to a quadric. If 
$K_{ik}$ is a symmetrical tensor with real components the equation 

$K_{ik} x_i x_k = \text{constant}$ (1) 

represents a central quadric with centre at the origin. Now 

$x_i = s l_i$ (2) 

represents a line through the origin, and the polar plane of a point on it is 

$K_{ik} l_k x_i = \text{constant}$. (3) 

This plane is perpendicular to the line if 

$K_{ik} l_k = \lambda l_i$, (4) 

where $\lambda$ is the same for $i = 1, 2, 3$. The condition for consistency of these equations is 
a cubic equation for $\lambda$, and any root will in general give admissible ratios of the $l_i$. 
These will be real if $\lambda$ is real, and then the line will be perpendicular to the polar plane 
of any point on it and in particular to the tangent plane at the point where the line 
meets the quadric. Such a line is a principal axis. 

We show first that if there are two solutions corresponding to different values of $\lambda$, say 
$\lambda_1$ and $\lambda_2$, they give values of $l_i$, say $l_{i1}$ and $l_{i2}$, satisfying $l_{i1} l_{i2} = 0$. If $\lambda_1$ and $\lambda_2$ are real 
this says that the lines are perpendicular. Then 

$K_{ik} l_{k1} = \lambda_1 l_{i1}$, (5) 

$K_{ik} l_{k2} = \lambda_2 l_{i2}$. (6) 

Multiply these respectively by $l_{i2}$, $l_{i1}$ and contract; then 

$K_{ik} l_{i2} l_{k1} = \lambda_1 l_{i1} l_{i2}$, (7) 

$K_{ik} l_{i1} l_{k2} = \lambda_2 l_{i1} l_{i2}$. (8) 

But since $K_{ik}$ is symmetrical the expressions on the left are equal. Therefore 

$(\lambda_1 - \lambda_2) l_{i1} l_{i2} = 0$, (9) 

and if $\lambda_1 \neq \lambda_2$, 

$l_{i1} l_{i2} = 0$. (10) 
( 10 ) 
The condition of consistency for (4), with $l_i \neq 0$, is that the determinant $\| K_{ik} - \lambda\delta_{ik} \| = 0$; 
that is 

$\begin{vmatrix} K_{11} - \lambda & K_{12} & K_{13} \\ K_{21} & K_{22} - \lambda & K_{23} \\ K_{31} & K_{32} & K_{33} - \lambda \end{vmatrix} = 0$. (11) 

This must have one real root; call it $\lambda_1$. Take the resulting $l_{i1}$ as the direction cosines of a 
new axis of $x'_1$, and take two axes $x'_2$, $x'_3$ perpendicular to this. Then if accents indicate 
direction cosines with regard to the new axes, $l'_{11} = 1$, $l'_{21} = l'_{31} = 0$; and by (4) 

$K'_{11} = \lambda_1$, $K'_{12} = 0$, $K'_{13} = 0$. (12) 

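The equation (11) therefore now takes the form 

$\begin{vmatrix} \lambda_1 - \lambda & 0 & 0 \\ 0 & K'_{22} - \lambda & K'_{23} \\ 0 & K'_{23} & K'_{33} - \lambda \end{vmatrix} = 0$. (13) 

Hence 

$\lambda = \lambda_1$ (14) 

or 

$\lambda^2 - (K'_{22} + K'_{33})\lambda + K'_{22} K'_{33} - (K'_{23})^2 = 0$. (15) 

Equation (15) has real roots, since 

$(K'_{22} + K'_{33})^2 - 4\{K'_{22} K'_{33} - (K'_{23})^2\} = (K'_{22} - K'_{33})^2 + 4(K'_{23})^2 \geq 0$. (16) 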


If this expression is zero the roots are equal, and conversely. If the roots are different 
there are three real perpendicular directions satisfying (4), and they are called the principal 
axes and the values of $\lambda$ the principal values of the tensor. When the tensor is referred 
to the principal axes $x'_j$ it takes the form 

$\begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}$, (17) 

and is said to be reduced to diagonal form. Then with respect to the original axes 

$K_{ik} = l_{ij} l_{kl} K'_{jl} = \lambda_1 l_{i1} l_{k1} + \lambda_2 l_{i2} l_{k2} + \lambda_3 l_{i3} l_{k3}$. (18) 


If two roots are equal, take them to be $\lambda_2$; then $K'_{22} = K'_{33}$, $K'_{23} = 0$ and the quadric 
becomes 

$\lambda_1 x_1'^2 + \lambda_2 (x_2'^2 + x_3'^2) = \text{constant}$. (19) 

It is therefore a surface of revolution and any line in the plane of $x'_2$ and $x'_3$ is a principal 
axis. 

If all three roots are equal the quadric is a sphere. 

In both these special cases it remains true that we can find three perpendicular directions 
satisfying (4); but we can now do it in an infinite number of ways, whereas when all 
the roots are unequal we can only do it in one way. 
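Equations (4), (11) and (18) fit together as follows: if a symmetrical tensor is built from chosen principal values and perpendicular directions by (18), each direction satisfies (4) and each principal value is a root of (11). The sketch below checks this numerically; the principal values, the rotation angles and the helper names are illustrative assumptions, and it does not attempt to solve the cubic for a given tensor.

```python
import math

def direction_cosines(a, b):
    """l[i][j] = i-th direction cosine of the j-th new (principal) axis."""
    ca, sa, cb, sb = math.cos(a), math.sin(a), math.cos(b), math.sin(b)
    r3 = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    r1 = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    return [[sum(r3[i][k] * r1[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(a):
    """Determinant of a 3 x 3 array."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

principal_values = [1.0, 2.0, 4.0]   # illustrative, all different
l = direction_cosines(0.4, 0.9)
R = range(3)

# Equation (18): K_ik = lambda_1 l_i1 l_k1 + lambda_2 l_i2 l_k2 + lambda_3 l_i3 l_k3
K = [[sum(principal_values[j] * l[i][j] * l[k][j] for j in R) for k in R] for i in R]

# Equation (4): K_ik l_kj - lambda_j l_ij should vanish for each axis j
residual = max(abs(sum(K[i][k] * l[k][j] for k in R) - principal_values[j] * l[i][j])
               for i in R for j in R)

# Equation (11): the determinant vanishes at each principal value
dets = [det3([[K[i][k] - (lam if i == k else 0.0) for k in R] for i in R])
        for lam in principal_values]
print(residual, dets)
```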
3·081. Inertia tensor of a rigid body. Consider a rigid body moving with angular 
velocity $\boldsymbol{\omega}$ with one point O fixed. If a particle P($x_i$) has mass $m$ we have 

$\frac{d}{dt} S\,m(x_k \dot{x}_m - x_m \dot{x}_k) = S\,m(x_k \ddot{x}_m - x_m \ddot{x}_k) = S(x_k X_m - x_m X_k)$, (1) 

which is a relation between antisymmetrical tensors, expressible also in vector form 

$\frac{d}{dt} S\,m\,\epsilon_{ikm} x_k \dot{x}_m = S\,\epsilon_{ikm} x_k X_m$ (2) 

or 

$\frac{d}{dt} S\,m(\mathbf{x} \wedge \dot{\mathbf{x}}) = S(\mathbf{x} \wedge \mathbf{X})$. (3) 


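The expression $S\,m(\mathbf{x} \wedge \dot{\mathbf{x}})$ is called the angular momentum of the body about O and denoted 
by $\mathbf{h}(O)$. Now since O is fixed, if $\boldsymbol{\omega}$ is the angular velocity, 

$\dot{\mathbf{x}} = \boldsymbol{\omega} \wedge \mathbf{x}$, (4) 

that is, 

$\dot{x}_m = \epsilon_{mps}\,\omega_p x_s$, (5) 

and 

$h_i(O) = S\,m\,\epsilon_{ikm} x_k\,\epsilon_{mps}\,\omega_p x_s$ 
$= S\,m(\delta_{ip}\delta_{ks} - \delta_{is}\delta_{kp}) x_k x_s \omega_p$ 
$= S\,m(x_k^2 \omega_i - x_i x_p \omega_p)$ 
$= I_{ik}\omega_k$, (6) 

where $I_{ik}$ is the symmetrical tensor 

$S\,m(x_m^2 \delta_{ik} - x_i x_k)$. (7) 

In dyadic notation 

$\mathbf{h}(O) = \mathbf{I} \cdot \boldsymbol{\omega}$. (8) 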

$I_{ik}$ is called the inertia tensor of the body about O. Written out in full it is 

$\begin{pmatrix} S\,m(x_2^2 + x_3^2) & -S\,m x_1 x_2 & -S\,m x_1 x_3 \\ -S\,m x_1 x_2 & S\,m(x_3^2 + x_1^2) & -S\,m x_2 x_3 \\ -S\,m x_1 x_3 & -S\,m x_2 x_3 & S\,m(x_1^2 + x_2^2) \end{pmatrix}$. (9) 

The diagonal components are the moments of inertia about the axes, and the non-diagonal 
components are the products of inertia multiplied by $-1$. Since $I_{ik}$ is a symmetrical tensor, 
axes can be found such that the products of inertia vanish and the tensor takes the form 

$\begin{pmatrix} A & 0 & 0 \\ 0 & B & 0 \\ 0 & 0 & C \end{pmatrix}$. (10) 

$A$, $B$, $C$ are the principal moments of inertia at O. It is readily proved that 

(1) The moment of inertia about a line with direction cosines $n_i$ is 

$I_{ik} n_i n_k = n_i I_{ik} n_k = \mathbf{n} \cdot \mathbf{I} \cdot \mathbf{n}$. (11) 

(2) The product of inertia with respect to two perpendicular lines with direction 
cosines $n_i$, $n'_i$ is 

$-n_i I_{ik} n'_k = -\mathbf{n} \cdot \mathbf{I} \cdot \mathbf{n}'$. (12) 
(3) If the centre of mass of the body is G with coordinates $\bar{x}_i$ relative to O, and if $I_{ik}(O)$ 
and $I_{ik}(G)$ are the inertia tensors at O and G respectively, and $S\,m = M$, 

$I_{ik}(O) = I_{ik}(G) + M(\bar{x}_m^2 \delta_{ik} - \bar{x}_i \bar{x}_k)$. (13) 

(4) The kinetic energy of the body, moving with O fixed, is 

$\tfrac{1}{2} I_{ik}(O)\,\omega_i \omega_k = \tfrac{1}{2}\boldsymbol{\omega} \cdot \mathbf{I}(O) \cdot \boldsymbol{\omega}$. (14) 

(5) The kinetic energy of the body moving in any manner is 

$\tfrac{1}{2} M V_i^2 + \tfrac{1}{2} I_{ik}(G)\,\omega_i \omega_k = \tfrac{1}{2} M \mathbf{V} \cdot \mathbf{V} + \tfrac{1}{2}\boldsymbol{\omega} \cdot \mathbf{I}(G) \cdot \boldsymbol{\omega}$, (15) 

where $\mathbf{V}$ is the velocity of the centre of mass. 

Since $I_{ik}\omega_k$ is a vector, its components about any set of axes can be written down at 
once. In particular, if we take as axes the principal axes of inertia the components of 
angular momentum about them are $(A\omega_1, B\omega_2, C\omega_3)$, and for a rigid body $A$, $B$, $C$ are 
independent of time. It is this fact that makes the use of moving axes convenient. 
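For a body represented by a few point masses, the inertia tensor (7), the angular momentum (6) and the kinetic energy (14) can be computed directly and compared with sums over the particles. The masses, positions, angular velocity and function names below are illustrative assumptions; indices run from 0 to 2.

```python
def cross(a, b):
    """Vector product a ^ b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def inertia(particles):
    """I_ik = S m (x_m^2 delta_ik - x_i x_k), summed over (mass, position) pairs."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, x in particles:
        r2 = sum(c * c for c in x)
        for i in range(3):
            for k in range(3):
                I[i][k] += m * ((r2 if i == k else 0.0) - x[i] * x[k])
    return I

particles = [(2.0, [1.0, 0.0, 0.0]), (1.0, [0.0, 2.0, 1.0]), (3.0, [-1.0, 1.0, 2.0])]
omega = [0.5, -1.0, 2.0]
I = inertia(particles)

# Angular momentum: h = I . omega compared with S m x ^ (omega ^ x)
h_tensor = [sum(I[i][k] * omega[k] for k in range(3)) for i in range(3)]
h_direct = [0.0, 0.0, 0.0]
for m, x in particles:
    c = cross(x, cross(omega, x))
    for i in range(3):
        h_direct[i] += m * c[i]

# Kinetic energy: (1/2) omega . I . omega compared with (1/2) S m |omega ^ x|^2
T_tensor = 0.5 * sum(omega[i] * I[i][k] * omega[k] for i in range(3) for k in range(3))
T_direct = 0.5 * sum(m * sum(v * v for v in cross(omega, x)) for m, x in particles)
print(h_tensor, h_direct, T_tensor, T_direct)
```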

3·09. Finite rotation of a rigid body. We have shown* that a finite rotation of a 
rigid body about a fixed point cannot be represented by a vector in the direction of the 
axis of rotation. We now show how such a rotation can be represented by a tensor. 

We take the origin at the point O of the body, and O123 is the frame of reference. Let 
P($x_i$) be a point of the body. The body is rotated through an angle $\theta$ about a line through 
O with direction cosines $n_i$, and P moves to Q($y_i$). Let M be the projection of P on the 
axis of rotation; then M is $n_k x_k n_i = p n_i$ say. The rotation displaces P through 
$(1 - \cos\theta)$ PM towards M, and through PM $\sin\theta$ at right angles to the plane OPM. If $\theta$ 
is a right-handed rotation the latter displacement is in the direction of the vector product 
$\mathbf{n} \wedge \mathbf{x}$. To get its magnitude, let the angle MOP be $\alpha$; then the modulus of $\mathbf{n} \wedge \mathbf{x}$ is 
OP $\sin\alpha$ = PM. Hence the second part of the displacement is $\sin\theta\,(\mathbf{n} \wedge \mathbf{x})$, and 

$y_i - x_i = -(1 - \cos\theta)(x_i - p n_i) + \sin\theta\,(\mathbf{n} \wedge \mathbf{x})_i$, 

$y_i = \cos\theta\,x_i + (1 - \cos\theta)\,n_i n_k x_k + \sin\theta\,\epsilon_{ikm} n_k x_m$ 
$= \{\cos\theta\,\delta_{ik} + (1 - \cos\theta)\,n_i n_k - \sin\theta\,\epsilon_{ikm} n_m\}\,x_k$. (1) 

The quantity in brackets { } is clearly a tensor of order 2, which we may denote by $R_{ik}$. It 
is neither symmetrical nor antisymmetrical. 

If the body undergoes successive rotations represented by tensors $R^{(1)}_{ik}$, $R^{(2)}_{ik}$, ..., $R^{(n)}_{ik}$, 
then the final position of P is given by 

$y_i = R^{(n)}_{ij} \cdots R^{(2)}_{lm} R^{(1)}_{mk}\,x_k$. 

We have seen in 2·03 that the order of the rotations is important. 
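The rotation tensor of 3·09 (1) and the dependence on the order of successive rotations can be tried out directly. The sketch below is illustrative; the function names, the axes and the angles are assumptions, and indices run from 0 to 2 (so axis 3 of the text is index 2).

```python
import math

def epsilon(i, k, m):
    """Alternating tensor on indices 0, 1, 2."""
    return (i - k) * (k - m) * (m - i) // 2

def rotation_tensor(n, theta):
    """R_ik = cos(theta) delta_ik + (1 - cos(theta)) n_i n_k - sin(theta) eps_ikm n_m,
    for a unit vector n along the axis of rotation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * (1.0 if i == k else 0.0) + (1.0 - c) * n[i] * n[k]
             - s * sum(epsilon(i, k, m) * n[m] for m in range(3))
             for k in range(3)] for i in range(3)]

def apply(R, x):
    """y_i = R_ik x_k."""
    return [sum(R[i][k] * x[k] for k in range(3)) for i in range(3)]

quarter = math.pi / 2.0
R3 = rotation_tensor([0.0, 0.0, 1.0], quarter)   # 90 degrees about axis 3
R1 = rotation_tensor([1.0, 0.0, 0.0], quarter)   # 90 degrees about axis 1
x = [1.0, 0.0, 0.0]

y = apply(R3, x)                                 # axis 1 is carried towards axis 2
first_3_then_1 = apply(R1, apply(R3, x))
first_1_then_3 = apply(R3, apply(R1, x))
print(y, first_3_then_1, first_1_then_3)
```

The two composite results differ, in agreement with the remark that the order of the rotations is important.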


* See 2·03, p. 62, passage in small type. 
3·091. When $\theta$ is small, $R_{ik}$ reduces, to the first order, to $\delta_{ik} - \theta\,\epsilon_{ikm} n_m$ and 

$\mathbf{y} - \mathbf{x} = \boldsymbol{\theta} \wedge \mathbf{x}$, (2) 

where 

$\boldsymbol{\theta} = \theta\,\mathbf{n}$. (3) 

The resultant of two successive small rotations $\boldsymbol{\theta}$, $\boldsymbol{\theta}'$ is 

$(\boldsymbol{\theta} + \boldsymbol{\theta}') \wedge \mathbf{x}$, (4) 

and in this sense a small rotation is represented by a vector. The same result was obtained 
in Chapter 2, but it was not possible there to give explicitly the terms of order $\theta^2$ neglected 
in its derivation. 
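The statement that the neglected terms are of order $\theta^2$ can be seen numerically by comparing the exact result of two successive rotations with the vector approximation (4) for two sizes of angle. This is a sketch under stated assumptions: the axes, the point $\mathbf{x}$, the angles and the function names are illustrative choices.

```python
import math

def epsilon(i, k, m):
    """Alternating tensor on indices 0, 1, 2."""
    return (i - k) * (k - m) * (m - i) // 2

def rotation_tensor(n, theta):
    """R_ik of 3.09 (1) for a unit vector n and angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * (1.0 if i == k else 0.0) + (1.0 - c) * n[i] * n[k]
             - s * sum(epsilon(i, k, m) * n[m] for m in range(3))
             for k in range(3)] for i in range(3)]

def apply(R, x):
    return [sum(R[i][k] * x[k] for k in range(3)) for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def second_order_error(t):
    """Largest component of R'Rx - (x + (theta + theta') ^ x) for two
    rotations of angle t, the first about axis 3 and the second about axis 1."""
    n3, n1 = [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]
    x = [1.0, 2.0, 3.0]
    exact = apply(rotation_tensor(n1, t), apply(rotation_tensor(n3, t), x))
    total = [t * (n3[i] + n1[i]) for i in range(3)]
    c = cross(total, x)
    return max(abs(exact[i] - (x[i] + c[i])) for i in range(3))

small = second_order_error(1e-3)
larger = second_order_error(1e-2)
ratio = larger / small   # close to 100 if the error is of second order
print(small, larger, ratio)
```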


3·092. Tensor representation of angular velocity. We have shown in Chapter 2 
that there is a vector $\boldsymbol{\omega}$ representing the angular velocity of a rigid body, and that the 
velocities $\mathbf{v}_P$, $\mathbf{v}_Q$ of two points are connected by the relation 

$\mathbf{v}_Q = \mathbf{v}_P + \boldsymbol{\omega} \wedge \mathbf{PQ}$. (1) 

If PQ has components $x_i$ and we write the components of $\mathbf{v}_P$ and $\mathbf{v}_Q$ as $v_i^P$, $v_i^Q$, we have 
by 3·072 (13) that if 

$\Omega_{ik} = \epsilon_{ikm}\,\omega_m$, (2) 

$v_i^Q - v_i^P = -\Omega_{ik} x_k$. (3) 

The form (3) is in a sense more general than (1). A rotation about O3 is the same thing 
as one from O1 towards O2, in three dimensions. In any number of dimensions ($\geq 2$) we 
can speak of a rotation from O1 towards O2; but it is only in three dimensions that such 
a rotation can be said to be about any particular axis. We shall consider this further in 
the next chapter. 

3·10. General motion of a fluid. When the particles of a system are not constrained 
to remain at constant distances apart, the motion can no longer be specified by the 
velocity of one point and an angular velocity. 

Let $x_i$ be the position vector of a general particle P of the fluid and let $v_i$ be its velocity 
at a given time. Then $v_i$ is a function both of $x_i$ and of $t$. The velocity $v_i + \delta v_i$ at a neighbouring 
point Q($x_i + \delta x_i$) is 

$v_i + \delta v_i = v_i + \frac{\partial v_i}{\partial x_k}\delta x_k + O(\delta x_k)^2$. (1) 

To the first order in $\delta x_k$ we have therefore 

$\delta v_i = \tfrac{1}{2}\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right)\delta x_k - \tfrac{1}{2}\left(\frac{\partial v_k}{\partial x_i} - \frac{\partial v_i}{\partial x_k}\right)\delta x_k = e_{ik}\,\delta x_k - \xi_{ik}\,\delta x_k$, (2) 

say. Then $e_{ik}$ is a symmetrical tensor and $\xi_{ik}$ an antisymmetrical one, both of dimensions 
$1/t$. The part of $\delta v_i$ depending on $\xi_{ik}$ is the same as the displacement due to a rigid-body 
rotation with components $(\xi_{23}, \xi_{31}, \xi_{12})$. We shall see in a moment what the other part 
represents. Consider the rate of change of $\tfrac{1}{2}PQ^2$; this is 

$\delta x_i\,\delta v_i = (e_{ik}\,\delta x_k - \xi_{ik}\,\delta x_k)\,\delta x_i$, (3) 

and the part depending on $\xi_{ik}$ is zero because $\xi_{ik}$ is antisymmetrical and all the terms 
cancel. Changes of distance between neighbouring particles therefore depend entirely on 
the $e_{ik}$. Of these, any one can be different from zero and the others zero except of course 
the one required not to be zero by the conditions of symmetry if $i \neq k$. Thus if $\delta v_1 = e\,\delta x_1$, 
$\delta v_2 = \delta v_3 = 0$ we have $e_{11} = e$ and all the rest 0. If $\delta v_i = (e\,\delta x_2, e\,\delta x_1, 0)$ we have $e_{12} = e_{21} = e$ 
and all the rest 0; and similarly for the other components by symmetry. The corresponding 
changes in the plane of $\delta x_3 = 0$ are illustrated. 


[Figure: changes of a square element in the plane $\delta x_3 = 0$; panels labelled $e_{11}$, $e_{22}$, $e_{12}$, with axes $x_1$, $x_2$.] 

The tensor $e_{ik}$ therefore represents the rates of change in size and shape of an element of 
fluid surrounding P. It is called the rate of strain tensor. It has three principal axes, and 
the changes can be reduced to three extensions along them. If the principal values are 
equal the rates of extension are the same in all directions, that is, the strain near P is a 
symmetrical expansion or compression. 
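For a velocity field that is linear in the coordinates, $v_i = G_{ik} x_k$, the velocity gradient $\partial v_i/\partial x_k$ is the constant array $G_{ik}$ and the decomposition of 3·10 (2) can be carried out exactly. The sketch below assumes such a field with illustrative numbers; the names `G`, `e`, `xi` and the sample displacement `dx` are assumptions, and indices run from 0 to 2.

```python
def epsilon(i, k, m):
    """Alternating tensor on indices 0, 1, 2."""
    return (i - k) * (k - m) * (m - i) // 2

R = range(3)
# G[i][k] = dv_i/dx_k for the assumed linear field v_i = G_ik x_k
G = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0], [4.0, 0.0, 0.5]]

e = [[0.5 * (G[i][k] + G[k][i]) for k in R] for i in R]    # rate of strain e_ik
xi = [[0.5 * (G[k][i] - G[i][k]) for k in R] for i in R]   # antisymmetrical xi_ik

dx = [0.2, -0.1, 0.4]
dv = [sum(G[i][k] * dx[k] for k in R) for i in R]
dv_split = [sum((e[i][k] - xi[i][k]) * dx[k] for k in R) for i in R]

# Rotation components (xi_23, xi_31, xi_12) compared with (1/2) curl v
rotation = [xi[1][2], xi[2][0], xi[0][1]]
half_curl = [0.5 * sum(epsilon(i, k, m) * G[m][k] for k in R for m in R) for i in R]

# Rate of change of (1/2) PQ^2: only e_ik contributes
stretch_rate = sum(dx[i] * dv[i] for i in R)
stretch_from_e = sum(dx[i] * e[i][k] * dx[k] for i in R for k in R)
print(rotation, half_curl, stretch_rate, stretch_from_e)
```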

In a certain sense the $\xi_{ik}$ represent a local angular velocity; but this statement requires qualification 
because the $e_{ik}$ obviously imply angular velocities, though these are in opposite senses for different 
parts of the element, and without further restriction an angular velocity of an element round P has 
no definite meaning. We consider a small element of fluid with P as its centre of mass, and suppose a 
small rigid body with the same density distribution to have the same angular momentum about P. 
Then we shall show that, provided the principal axes of the inertia tensor $I_{ik}(P)$ coincide with those 
of $e_{ik}$, the angular velocity of the rigid body is $\omega_i = \tfrac{1}{2}\epsilon_{ikm}\xi_{km} = \tfrac{1}{2}(\mathrm{curl}\ \mathbf{v})_i$. 

We have, if $h_i$ is the angular momentum of the element considered, 

$h_i = I_{ik}\omega_k = S\,m\,\epsilon_{ikm}\,\delta x_k (v_m + \delta v_m)$ 
$= \epsilon_{ikm} v_m\,S\,m\,\delta x_k + S\,m\,\epsilon_{ikm} e_{mp}\,\delta x_k\,\delta x_p - S\,m\,\epsilon_{ikm}\xi_{mp}\,\delta x_k\,\delta x_p$. (4) 

Now since P is the centre of mass of the element, $S\,m\,\delta x_k = 0$. Also 

$S\,m\,\delta x_k\,\delta x_p = -I_{kp} + \delta_{kp}\,S\,m(\delta x_s)^2$. (5) 

Suppose now that $I_{ik}$ and $e_{ik}$ have the same principal axes; that is, we can take the axes so that 
$I_{ik} = 0$, $e_{ik} = 0$ unless $i = k$. Then 

$\epsilon_{ikm} e_{mp} I_{kp} = \epsilon_{ikm} e_{mp}\delta_{kp} = 0$, (6) 

since $e_{mp} I_{kp}$ and $e_{mp}\delta_{kp}$ vanish unless $m = k$ and $\epsilon_{ikm} = 0$ if $m = k$. Hence the second term on the 
right of (4) is 

$\{-I_{kp} + S\,m(\delta x_s)^2\,\delta_{kp}\}\,\epsilon_{ikm} e_{mp} = 0$. (7) 

If we write $\xi_{mp} = \epsilon_{smp}\xi_s$, $\xi_i = \tfrac{1}{2}\epsilon_{ikm}\xi_{km}$ (cf. 3·072, (10) and (12)) 
the last term in (4) becomes 

$-S\,m\,\epsilon_{ikm}\epsilon_{smp}\,\xi_s\,\delta x_k\,\delta x_p = S\,m[(\delta x_s)^2\delta_{ik} - \delta x_i\,\delta x_k]\,\xi_k = I_{ik}\xi_k$, 

which is the angular momentum of a rigid body filling the element with inertia tensor $I_{ik}$ and angular 
velocity $\xi_k$. Hence 

$I_{ik}\omega_k = I_{ik}\xi_k$, (8) 

and therefore $\omega_k = \xi_k$ provided $\| I_{ik} \| \neq 0$. 

Conversely, if $\omega_k = \xi_k$, it follows that 

$S\,m\,\epsilon_{ikm} e_{mp}\,\delta x_k\,\delta x_p = 0$. (9) 
Refer to principal axes of $I_{ik}$. Then $S\,m\,\delta x_k\,\delta x_p = 0$ if $k \neq p$. Then, for example, if $i = 1$, the only 
non-zero terms in (9) are for $k = p = 2$, $m = 3$, and $k = p = 3$, $m = 2$; and 

$e_{32}\,S\,m(\delta x_2)^2 - e_{23}\,S\,m(\delta x_3)^2 = 0$. (10) 

Hence either $e_{23} = 0$ or $I_{22} = I_{33}$. (11) 

Similarly, we see that $e_{31} = 0$ or $I_{33} = I_{11}$, $e_{12} = 0$ or $I_{11} = I_{22}$. (12) 

If then the principal axes of the element are determinate ($I_{11} \neq I_{22} \neq I_{33}$), we have $e_{ik} = 0$ for $i \neq k$, 
and therefore the principal axes of $e_{ik}$ and $I_{ik}$ coincide. 

3·101. Elastic strain. The analysis of displacement in an elastic solid is almost 
identical with that of velocity in a fluid. It is convenient to consider the particle at P($x_i$) 
at time $t$ to have already received a small displacement $u_i$, so that its undisturbed position 
was at $x_i - u_i$. Then if $u_i + \delta u_i$ is the displacement at Q($x_i + \delta x_i$), we have in just the 
same way 

$\delta u_i = e_{ik}\,\delta x_k - \xi_{ik}\,\delta x_k$, 

where $e_{ik}$ and $\xi_{ik}$ are respectively symmetrical and antisymmetrical tensors. They can 
be interpreted as giving the changes of size and shape of an element, and its rotation; in 
a fluid, since $v_i$ is there taken to be the velocity, the $e_{ik}$ and $\xi_{ik}$ there defined correspond 
to the rates of change of those defined here for an elastic solid. 

3·102. Stress. In the interior of a substance, whether solid, liquid, or gas, there are 
in general reactions between the parts. The general nature of these can be seen by considering 
how we can apply forces to the outside of a solid with, say, one face clamped. They 
can be applied to any part of the accessible surface; and we can either press or pull normally 
on the surface or apply a tangential drag, as by friction. The notion of a state of stress extends 
this notion of a force across a surface to all elements of surfaces, even in the interior. If $dS$ 
is a small element of surface, with its normal in the direction $\mathbf{n}$, we speak of the reactions acting 
across $dS$, and representing the forces between the particles on opposite sides of it. The components 
of the force depend both on the size and on the direction of $dS$ and therefore are 
written $p_{ni}\,dS$. In particular if $\mathbf{n}$ is in the direction of $x_k$ we denote the force by $p_{ki}\,dS$ 
and call $p_{ki}$ the stress components, which are therefore forces per unit area. The sign is 
specified by taking $p_{ki}\,dS$ to be the reaction on the matter on the side where $x_k$ is smaller 
and tending to increase its $i$ coordinate. The force on the side where $x_k$ is greater, tending 
to increase its $i$ coordinate, will be $-p_{ki}\,dS$. 

The stress components have two remarkable properties. They form a symmetrical 
tensor; and they have a simple linear relation to the rate of strain tensor in the case of a 
fluid, and to the strain tensor in an elastic solid provided the strains are small. We shall 
assume them to have continuous derivatives. 
3·103. We first show that they form a tensor. Let a plane of $x'_j$ constant make small 
intercepts of order $a$ on the axes and consider the forces on the small tetrahedron between 
this plane and the $x_i$ axes. Let $dS$ be the area of the $x'_j$ plane forming the base of this 
tetrahedron. Then $p_{j'i}\,dS$ are the forces acting on the interior across $dS$, the accented 
suffix referring to the new axis $x'_j$ and the unaccented one to the old axis $x_i$. Now the 
magnitude of the force acting across the face of constant $x_k$ is $p_{ki}$ times the area of that 
face, and the area is $l_{kj}\,dS$, where $l_{kj}$ is the cosine of the angle between $x_k$ and $x'_j$. It acts, 
however, on the matter on the positive side of $x_k = 0$ and must therefore be taken with the 
negative sign. 

The matter inside the tetrahedron will in general be acted on by external forces such as 
gravity, which will be of order $a^3$ when $a$ is small. These are called body forces. It will also 
have an acceleration, which we suppose always finite. Then the rate of change of momentum 
is also of order $a^3$, and the condition that the rate of change of momentum of the element 
is equal to the total force gives 

$(p_{j'i} - l_{kj} p_{ki})\,dS = O(a^3)$. 

This relates the forces in the direction of $x_i$. But we can now resolve them in the direction 
of $x'_l$; and 

$(l_{il} p_{j'i} - l_{il} l_{kj} p_{ki})\,dS = O(a^3)$. 

But $l_{il} p_{j'i}\,dS$ is the component in the direction of $x'_l$ of the force across a plane of $x'_j$ constant; 
and therefore is the same as $p'_{jl}\,dS$, where $p'_{jl}$ are the stress components with regard to the 
new axes. Also $dS$ is of order $a^2$; hence by making $a$ tend to zero we have 

$p'_{jl} = l_{il} l_{kj} p_{ki} = l_{ij} l_{kl} p_{ik}$. 

Therefore $p_{ik}$ is a tensor. 
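The tetrahedron argument amounts to saying that, in the limit, the force per unit area across a plane whose normal has direction cosines $n_k$ has components $n_k p_{ki}$. A minimal sketch of that rule follows; the function name `traction`, the sample stresses and the chosen normal are illustrative assumptions, and the sign convention of 3·102 for which side of the plane is acted on is not modelled.

```python
def traction(p, n):
    """Force per unit area across a plane with unit normal n: t_i = n_k p_ki."""
    return [sum(n[k] * p[k][i] for k in range(3)) for i in range(3)]

pressure = 2.0  # illustrative value
# A uniform pressure: p_ki = -pressure * delta_ki
hydrostatic = [[-pressure if i == k else 0.0 for k in range(3)] for i in range(3)]
# A simple shear stress with p_12 = p_21
shear = [[0.0, 1.5, 0.0], [1.5, 0.0, 0.0], [0.0, 0.0, 0.0]]

n = [0.6, 0.8, 0.0]                    # a unit normal in the plane of axes 1 and 2
t_hydro = traction(hydrostatic, n)     # along -n for every choice of n
t_shear = traction(shear, n)
normal_part = sum(t_shear[i] * n[i] for i in range(3))
print(t_hydro, t_shear, normal_part)
```

For the isotropic stress the traction is along the normal whatever the direction of the plane, whereas for the shear stress it has both normal and tangential parts that vary with the direction of $\mathbf{n}$.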

3·104. Now take a small parallelepiped with centre at $x_i$ and sides $\delta x_i$, and consider 
the moments about a line through the centre parallel to the axis of $x_3$. First ignore 
variations of the stress components in the region. Across the face $x_2 + \tfrac{1}{2}\delta x_2$ there is a force 
parallel to $x_1$ equal to $p_{21}\,\delta x_1\,\delta x_3$; and this has moment $-p_{21}\,\delta x_1\,\delta x_3\,(\tfrac{1}{2}\delta x_2)$. The force across the 
face $x_2 - \tfrac{1}{2}\delta x_2$ is equal and opposite, but as it is on the opposite side it has the same moment. 
The forces parallel to $x_2$ across the planes of $x_1$ constant have moment $p_{12}\,\delta x_1\,\delta x_2\,\delta x_3$. Evidently 
all forces arising from other stress components over the faces have no moment. Hence 
if the stress was uniform the moment would be 

$(p_{12} - p_{21})\,\delta x_1\,\delta x_2\,\delta x_3$. 

A little consideration will show that the change of the moment due to non-uniformity 
of stress is of order $a^4$ with the most extreme possible variability, where $\delta x_1$, $\delta x_2$, $\delta x_3$ are of 
order $a$. The body forces give a total force of order $a^3$, and will have a moment of order $a^4$. 
Similarly, the total rate of change of angular momentum, so long as the acceleration is 
finite, is of order $a^4$ at most. Hence 

$(p_{12} - p_{21})\,\delta x_1\,\delta x_2\,\delta x_3 = O(a^4)$, 

and by taking a sufficiently small region we show that 

$p_{12} = p_{21}$. 

By symmetry it follows that 

$p_{ik} = p_{ki}$, 

and therefore $p_{ik}$ is a symmetrical tensor. 

3·105. Equations of motion. Consider again a small parallelepiped. Let $\rho$ be the
density and $f_i$ the acceleration of the particle of matter momentarily at $x_i$; and let $X_i$
be the body force per unit mass. Then $\iiint \rho f_i\,dx_1\,dx_2\,dx_3$ through the element is
equal to the total force on the element. The body force contributes
$\iiint \rho X_i\,dx_1\,dx_2\,dx_3$; and we have to consider what contributions arise from
non-uniformity of the stress. The face $x_1 + \tfrac12\delta x_1$ will contribute
$p_{1i}\,\delta x_2\,\delta x_3$, where $p_{1i}$ is to be given its mean value over the face. But
the opposite face contributes $-p_{1i}\,\delta x_2\,\delta x_3$, where $p_{1i}$ is to be given its
mean value over that face. If the stress components are differentiable, which will in general be
true, the two faces together contribute

$$\frac{\partial p_{1i}}{\partial x_1}\,\delta x_1\,\delta x_2\,\delta x_3 + O(a^4).$$

Then all six faces contribute

$$\frac{\partial p_{ki}}{\partial x_k}\,\delta x_1\,\delta x_2\,\delta x_3 + O(a^4),$$

and by taking a sufficiently small parallelepiped we have the equations of motion

$$\rho f_i = \rho X_i + \frac{\partial p_{ki}}{\partial x_k}.$$

The above argument assumes nothing about the properties of the material except that 
action and reaction between neighbouring portions are equal and opposite, that all 
accelerations are bounded, and that the stress components are differentiable. It is equally 
valid for solids and liquids. Differences between the states of matter arise when we deal 
with the relation between stress and strain. 


3·106. Stress-strain relations. 3·1061. Elastic solid. It is fairly obvious that
a simple displacement or rotation of an elastic solid to a new position of equilibrium
requires no change of stress; and therefore that the stress components are independent
of the rotation. The fundamental relation is expressed by Hooke's law, which, in its most
general form, states that as long as the strain components are small the stress is linearly
related to them. This is true for the most anisotropic crystals. The usual theory of elasticity
is for isotropic solids; we then assume a much more restrictive relation; one way of stating
this is that the stress tensor and the strain tensor always have the same principal axes.
When a tension is applied along a uniform bar, the bar extends longitudinally and contracts
laterally, the changes of length of equal elements in all lateral directions being equal.
This is expressed, if the axis of $x_1$ is taken along the bar, by

$$E e_{11} = p_{11}, \qquad E e_{22} = E e_{33} = -\sigma p_{11}. \qquad (1)$$





$E$ is called Young's modulus and $\sigma$ Poisson's ratio; both are constants of the material.
In the conditions specified all stress components other than $p_{11}$ are zero, and the three
strain components $e_{23}$, $e_{31}$, $e_{12}$ are also zero. But then if we consider also
stresses $p_{22}$, $p_{33}$, since the stress-strain relation is linear, we can add the
corresponding strains and get the more general form

$$E e_{11} = p_{11} - \sigma(p_{22} + p_{33}), \qquad e_{23} = p_{23} = 0, \qquad (2)$$

with symmetrical relations. The first of these can be written

$$E e_{11} = (1+\sigma)p_{11} - \sigma(p_{11} + p_{22} + p_{33}), \qquad (3)$$

and the whole set are summarized by

$$E e_{ik} = (1+\sigma)p_{ik} - \sigma p_{mm}\delta_{ik}. \qquad (4)$$

This set of equations is valid for the set of axes chosen, which are principal axes of the
stress tensor and of the strain tensor. But if we now transform to any other set of rectangular
axes, every term transforms according to the rule for second-order tensors and
therefore we shall get

$$E e'_{jl} = (1+\sigma)p'_{jl} - \sigma p'_{nn}\delta_{jl}, \qquad (5)$$

$$p'_{nn} = p_{mm}. \qquad (6)$$


Hence the form (4) is not confined to principal axes and is true for any rectangular axes 
whatever. Its usefulness is due to the fact that in most problems of elasticity the principal 
axes of the stress are not in the same direction at all points and we need a form valid for 
all directions. It is convenient if the stresses are known and we have to find the strains 
from them. 

In many problems, on the other hand, the stresses are unknown and the equations of 
motion must be regarded as differential equations for the displacements. Then we need 
to express the stresses in terms of the displacements, and therefore in terms of the strains. 
This can be done as follows. First contract the equations (4); we get 

$$E e_{mm} = (1+\sigma)p_{mm} - 3\sigma p_{mm} = (1 - 2\sigma)p_{mm}, \qquad (7)$$

$$(1+\sigma)p_{ik} = E e_{ik} + \frac{\sigma E}{1 - 2\sigma}\,e_{mm}\delta_{ik}, \qquad (8)$$

$$p_{ik} = \lambda e_{mm}\delta_{ik} + 2\mu e_{ik}, \qquad (9)$$

where

$$\lambda = \frac{\sigma E}{(1+\sigma)(1-2\sigma)}, \qquad \mu = \frac{E}{2(1+\sigma)}. \qquad (10)$$

$\lambda$ and $\mu$ are known as Lamé's constants. $\lambda$ has no special name; $\mu$ is called
the rigidity. Evidently if a block is clamped along the plane of $x_2 = 0$ and a tangential
stress $p_{21}$ is applied over the opposite face the block will be distorted. Suppose the
displacement to be $(\eta x_2, 0, 0)$. Then $\eta$ is a small angle and is a measure of the shear.
All the strain components are found to be zero except $e_{12}$, $e_{21}$, which are both
$\tfrac12\eta$; and then from (9)

$$p_{21} = \mu\eta, \qquad (11)$$

so that $\mu$ is the ratio of shear stress to shear, and measures the resistance of the substance
to distortion.

Another important constant arises in the case where the strain is spherically symmetrical;
that is, if $e_{11} = e_{22} = e_{33}$, $e_{23} = e_{31} = e_{12} = 0$. This means that the matter
is stretched in the same ratio in all directions, and the associated type of stress is called
hydrostatic. Then (9) gives

$$p_{11} = 3\lambda e_{11} + 2\mu e_{11} = 3k e_{11}, \qquad (12)$$

where

$$k = \lambda + \tfrac23\mu. \qquad (13)$$

$k$ is called the bulk modulus because the relative change of volume is $3e_{11}$ to the first
order, so that $k$ is the ratio of the symmetrical stress to the change of volume. It is also
known as the incompressibility and $1/k$ as the compressibility.

In terms of $\lambda$ and $\mu$

$$E = \frac{\mu(3\lambda + 2\mu)}{\lambda + \mu}, \qquad \sigma = \frac{\lambda}{2(\lambda + \mu)}. \qquad (14)$$

Experimentally $E$, $k$, and $\mu$ are the easiest of the elastic constants to measure directly.
All have the dimensions of a stress.
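The relations (10), (13) and (14) are easy to check against one another numerically. The following is a minimal sketch in Python; the function names and the sample values $E = 200$, $\sigma = 0.3$ (in arbitrary units of stress) are illustrative assumptions and are not taken from the text.

```python
def lame_from_young(E, sigma):
    """Return (lam, mu, k) from Young's modulus E and Poisson's ratio sigma,
    using equations (10) and (13)."""
    lam = sigma * E / ((1 + sigma) * (1 - 2 * sigma))
    mu = E / (2 * (1 + sigma))
    k = lam + 2 * mu / 3
    return lam, mu, k


def young_from_lame(lam, mu):
    """Return (E, sigma) from Lame's constants, using equation (14)."""
    E = mu * (3 * lam + 2 * mu) / (lam + mu)
    sigma = lam / (2 * (lam + mu))
    return E, sigma


# Illustrative values only (assumed, arbitrary units of stress).
lam, mu, k = lame_from_young(200.0, 0.3)
E_back, sigma_back = young_from_lame(lam, mu)
```

Converting forward with (10) and back with (14) should return the starting pair, which is a quick guard against sign or factor slips when the constants are entered by hand.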


An alternative method is to assert directly that if there is a universal linear relation

$$p_{ik} = c_{ikmp}e_{mp}$$

valid for all axes, $c_{ikmp}$ is a tensor of order 4. If further its components have the same
values for all axes, it is isotropic, and therefore, by 3·031,

$$c_{ikmp} = \lambda\,\delta_{ik}\delta_{mp} + \mu(\delta_{im}\delta_{kp} + \delta_{ip}\delta_{km}) + \nu(\delta_{im}\delta_{kp} - \delta_{ip}\delta_{km}),$$

where $\lambda$, $\mu$, $\nu$ are scalars. Then

$$p_{ik} = \lambda\,\delta_{ik}e_{mm} + \mu(e_{ik} + e_{ki}) + \nu(e_{ik} - e_{ki}) = \lambda\,\delta_{ik}e_{mm} + 2\mu e_{ik},$$

since $e_{ik}$ is symmetrical.

This method has the advantage that it is possible, by suitable modifications of the method of 3*031, 
to find out what fourth-order tensors have the symmetry properties associated with various types of 
crystal. Then this method can be extended to obtain the stress-strain relations for crystals. 
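The contraction of the isotropic fourth-order tensor with a symmetrical strain can be verified by brute force. The sketch below is an illustration only: the helper names and the sample numbers are assumptions, and the value chosen for $\nu$ is deliberately large to show that it drops out.

```python
def delta(i, j):
    """Kronecker delta."""
    return 1.0 if i == j else 0.0


def c_iso(lam, mu, nu):
    """Isotropic fourth-order tensor c[i][k][m][p] in the form quoted from 3.031."""
    r = range(3)
    return [[[[lam * delta(i, k) * delta(m, p)
               + mu * (delta(i, m) * delta(k, p) + delta(i, p) * delta(k, m))
               + nu * (delta(i, m) * delta(k, p) - delta(i, p) * delta(k, m))
               for p in r] for m in r] for k in r] for i in r]


def stress(c, e):
    """p_ik = c_ikmp e_mp, summed over m and p."""
    r = range(3)
    return [[sum(c[i][k][m][p] * e[m][p] for m in r for p in r) for k in r] for i in r]


# A symmetrical strain with small illustrative components (assumed values).
e = [[1e-3, 2e-4, 0.0], [2e-4, -5e-4, 3e-4], [0.0, 3e-4, 2e-4]]
lam, mu, nu = 1.2, 0.8, 5.0
p = stress(c_iso(lam, mu, nu), e)
trace = e[0][0] + e[1][1] + e[2][2]
expected = [[lam * delta(i, k) * trace + 2 * mu * e[i][k] for k in range(3)]
            for i in range(3)]
```

Whatever value is given to `nu`, `p` agrees with `expected`, because the antisymmetrical part of $c_{ikmp}$ annihilates a symmetrical $e_{mp}$.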

3·1062. Fluid. In a fluid the mean stress $\tfrac13 p_{mm}$ is nearly always negative and is
denoted by $-p$; $p$ is called the pressure. (Contrary to what is stated in some text-books,
a liquid carefully freed from dissolved gases can stand an appreciable tension; but it is
true that tension seldom occurs in practice.) In a classical fluid the stress tensor is simply
$-p\,\delta_{ik}$, and this is a good approximation in many problems relating to real fluids. The
departure of the stress from this value is linearly related to the rate of strain, and if this
is now denoted by $e_{ik}$ it is true for a real fluid, as for an isotropic elastic solid, that the
principal axes of the stress tensor are also those of the tensor $e_{ik}$. The required relations
can therefore be written

$$p_{ik} + p\,\delta_{ik} = \lambda' e_{mm}\delta_{ik} + 2\mu' e_{ik}, \qquad (1)$$

but we must impose the further condition that by the definition of $p$ the tensor on the
left gives zero on contraction. Hence

$$(3\lambda' + 2\mu')e_{mm} = 0 \qquad (2)$$

and

$$p_{ik} = -p\,\delta_{ik} + 2\mu'(e_{ik} - \tfrac13 e_{mm}\delta_{ik}). \qquad (3)$$

$\mu'$ is called the viscosity. Its dimensions are those of a stress multiplied by a time. The
function multiplied by $2\mu'$ is the departure of the rate of strain from spherical symmetry.

The pressure $p$ has important properties. It is nearly independent of the rate of strain,
though it might theoretically contain a small term proportional to $e_{mm}$. This, however, is
so small that it has no practical importance. The pressure can therefore be treated as a
function of the density and temperature alone, according to the usual laws of thermal
expansion and compressibility for a liquid or gas.
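As an illustration of relation (3), the following Python sketch builds the stress in a viscous fluid from a given pressure, viscosity and rate of strain, so that the mean stress can be compared with $-p$. The function name and the sample numbers are assumptions made for the example.

```python
def fluid_stress(p, mu_visc, e):
    """Stress p_ik = -p delta_ik + 2 mu' (e_ik - e_mm delta_ik / 3) for a viscous fluid.

    p is the pressure, mu_visc the viscosity mu', e the rate-of-strain tensor."""
    trace = sum(e[i][i] for i in range(3))
    return [[-p * (i == k) + 2 * mu_visc * (e[i][k] - trace * (i == k) / 3)
             for k in range(3)] for i in range(3)]


# Illustrative rate of strain with non-zero e_mm (assumed values).
rate = [[0.2, 0.1, 0.0], [0.1, -0.05, 0.3], [0.0, 0.3, 0.4]]
s = fluid_stress(1.7, 0.9, rate)
mean_stress = (s[0][0] + s[1][1] + s[2][2]) / 3
```

The viscous part has zero contraction by construction, so `mean_stress` comes out equal to minus the pressure even though $e_{mm}$ is not zero in the sample.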

3·1063. The acceleration. We have left the acceleration term in the equations of
motion in the form $\rho f_i$. We need to express it, for a fluid, in terms of derivatives of the
velocity; for a solid, of derivatives of the displacement. Our derivation of the equations 
of motion made use of a parallelepiped fixed in space. We could take an element of volume 
moving with the matter instead, but this would not in general remain rectangular, and 
the resolution of the forces would be much more difficult. But the acceleration does refer 
to a particular particle of the matter. 

To make this explicit it is convenient temporarily to use Lagrange's way of specifying
the motion. The particle at $x_i$ at time $t$ is supposed to have been at $a_i$ at time $t_0$;
then the motion of every particle is described by an equation of the form

$$x_i = x_i(a_1, a_2, a_3, t), \qquad (1)$$

where, for a given particle, $a_k$ is independent of $t$. Then the velocity and acceleration of
the particle are

$$v_i = \left(\frac{\partial x_i}{\partial t}\right)_a, \qquad f_i = \left(\frac{\partial^2 x_i}{\partial t^2}\right)_a, \qquad (2)$$

where the suffix means that $a_k$ is kept constant during the differentiation.

In the usual Eulerian way of specifying the motion, the velocity of a particle is regarded
as a function of its position at time $t$ instead of at time $t_0$. Hence if in time $\delta t$ the
particle moves from $x_i$ to $x_i + \delta x_i$, its velocity will be $v_i$ evaluated at
$t + \delta t$, $x_i + \delta x_i$. Hence its acceleration is

$$f_i = \lim_{\delta t \to 0} \frac{v_i(t + \delta t,\, x_k + \delta x_k) - v_i(t,\, x_k)}{\delta t} = \frac{\partial v_i}{\partial t} + \frac{\partial v_i}{\partial x_k}\lim_{\delta t \to 0}\frac{\delta x_k}{\delta t}, \qquad (3)$$

and $\delta x_k/\delta t$, in the limit, is itself the velocity of the particle, $v_k$. Hence

$$f_i = \frac{\partial v_i}{\partial t} + v_k\frac{\partial v_i}{\partial x_k}. \qquad (4)$$

The operator $\partial/\partial t + v_k\,\partial/\partial x_k$, which gives the time derivative of
any quantity associated with a particular particle (i.e. $a_i$ constant in Lagrange's
specification) is usually denoted by $D/Dt$ in English works. It is, however, simply the partial
differential operator $\partial/\partial t$, with $a_i$ kept constant instead of $x_i$. When, as in
the Eulerian method, we suppress mention of $a_i$ altogether, there seems to be no adequate
reason against regarding the operator as an ordinary total derivative and denoting it by $d/dt$.
The notation $D/Dt$ is really a survival from the time when $d/dt$ was used to denote partial
differentiation.
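The agreement between the Lagrangian acceleration in (2) and the Eulerian form (4) can be checked on a simple one-dimensional motion. The sketch below assumes the motion $x = a(1+t)^2$, which is chosen only for illustration; for it the velocity field is $v(t, x) = 2x/(1+t)$ and the acceleration of every particle is $2a$. The derivatives in (4) are estimated by central differences.

```python
def v(t, x):
    """Eulerian velocity field of the assumed motion x = a*(1+t)**2."""
    return 2.0 * x / (1.0 + t)


def eulerian_acceleration(t, x, h=1e-5):
    """dv/dt + v dv/dx at a fixed point (t, x), with central-difference derivatives."""
    dv_dt = (v(t + h, x) - v(t - h, x)) / (2 * h)
    dv_dx = (v(t, x + h) - v(t, x - h)) / (2 * h)
    return dv_dt + v(t, x) * dv_dx


a, t = 1.5, 0.7                 # particle label and time (illustrative values)
x = a * (1 + t) ** 2            # where that particle is at time t
lagrangian = 2.0 * a            # second t-derivative of a*(1+t)**2 with a held fixed
eulerian = eulerian_acceleration(t, x)
```

Here the local term $\partial v/\partial t$ is negative and the convective term $v\,\partial v/\partial x$ is twice as large and positive; only their sum reproduces the particle's acceleration.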

In a fluid, therefore, the equations of motion have the form

$$\rho\left(\frac{\partial v_i}{\partial t} + v_k\frac{\partial v_i}{\partial x_k}\right) = \rho X_i + \frac{\partial p_{ki}}{\partial x_k}, \qquad (5)$$

where the stress components are related to the rate of strain and the pressure according
to 3·1062 (3).

In a solid the conditions are somewhat different, since the initial position appears in
the specification of the displacements; in fact 

( 6 ) 

But in nearly all problems of elasticity the displacements are small, and if we neglect their
squares $d/dt$ can be replaced by $\partial/\partial t$. Squares of the strains are also
neglected in taking the stress-strain relations as linear, so that there is no loss of
generality in also neglecting them in the acceleration.*

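3·11. Electromagnetic stress tensor. Take the electric forces first. If $K$ is the
dielectric constant, supposed uniform, $\mathbf{E}$ the electric intensity, and $\rho$ the
electric charge per unit volume, the electric force per unit volume is

$$\mathbf{X} = \rho\mathbf{E}, \qquad (1)$$

and

$$4\pi\rho = K\,\operatorname{div}\mathbf{E}. \qquad (2)$$

Then

$$4\pi X_i = K E_i\frac{\partial E_k}{\partial x_k} = K\frac{\partial}{\partial x_k}(E_iE_k) - K\frac{\partial E_k}{\partial x_i}E_k, \qquad (3)$$

since $\mathbf{E}$ is the gradient of a potential; and

$$\frac{\partial E_k}{\partial x_i}E_k = \frac{\partial}{\partial x_k}\left(\tfrac12 E_m^2\,\delta_{ik}\right). \qquad (4)$$

Hence the mechanical force can be regarded as derived from a stress

$$p_{ik} = \frac{K}{4\pi}\left(E_iE_k - \tfrac12 E_m^2\,\delta_{ik}\right). \qquad (5)$$

Now consider the force due to a magnetic field $\mathbf{H}$ on a medium carrying an electric
current of density $\mathbf{j}$. The permeability $\mu$ is taken constant. Then

$$\mathbf{X} = \frac{\mu}{c}\,\mathbf{j}\wedge\mathbf{H}, \qquad \operatorname{curl}\mathbf{H} = 4\pi\mathbf{j}/c, \qquad (6)$$

$$4\pi X_i = \mu\,\epsilon_{ikm}(\operatorname{curl}\mathbf{H})_k H_m = \mu\,\epsilon_{ikm}\epsilon_{kps}\frac{\partial H_s}{\partial x_p}H_m = \mu\left(\frac{\partial H_i}{\partial x_m} - \frac{\partial H_m}{\partial x_i}\right)H_m. \qquad (7)$$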


* The fullest discussion of the second-order terms is by F. D. Murnaghan, Am. J. Math. 59, 1937,
236-60.

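But

$$\frac{\partial}{\partial x_m}(H_iH_m) = H_m\frac{\partial H_i}{\partial x_m} + H_i\frac{\partial H_m}{\partial x_m}, \qquad (8)$$

and $\partial H_m/\partial x_m = 0$. Hence

$$4\pi X_i = \mu\frac{\partial}{\partial x_m}\left(H_iH_m - \tfrac12 H_s^2\,\delta_{im}\right), \qquad (9)$$

and $X_i$ can be derived from a stress tensor

$$p_{ik} = \frac{\mu}{4\pi}\left(H_iH_k - \tfrac12 H_m^2\,\delta_{ik}\right). \qquad (10)$$

For the additional terms required when $K$ and $\mu$ are not constant, see Abraham-Becker,
Classical Electricity and Magnetism, pp. 104, 146.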
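A stress of this form is a tension along the field and an equal pressure in every direction at right angles to it, both of magnitude (coefficient) × (field)$^2/8\pi$. The following Python sketch checks this numerically; the function name and the sample field are assumptions made for the illustration, and `coeff` stands for $K$ in the electric case or $\mu$ in the magnetic case.

```python
import math


def field_stress(F, coeff):
    """p_ik = coeff/(4 pi) * (F_i F_k - F_m^2 delta_ik / 2) for a field vector F."""
    f2 = sum(c * c for c in F)
    return [[coeff / (4 * math.pi) * (F[i] * F[k] - 0.5 * f2 * (i == k))
             for k in range(3)] for i in range(3)]


def apply(p, n):
    """The vector p_ik n_k."""
    return [sum(p[i][k] * n[k] for k in range(3)) for i in range(3)]


E = [3.0, -1.0, 2.0]            # illustrative field components (assumed)
K = 2.5                         # illustrative dielectric constant (assumed)
p = field_stress(E, K)
tension = K * sum(c * c for c in E) / (8 * math.pi)
along = apply(p, E)             # should equal  tension * E
n = [1.0, 3.0, 0.0]             # perpendicular to E, since 3 - 3 + 0 = 0
across = apply(p, n)            # should equal -tension * n
```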

3·12. Rate of change of a vector, when the axes are rotating. We have stated
the transformation rule for vectors in a way that depends only on the mutual inclinations
of the axes; so far we have not had occasion to consider what happens if these inclinations
are themselves varying with the time. So long as the transformation from one set of axes to
another is purely algebraic, there is no trouble; all identities depending on finding the
component of a vector in a given direction will remain true even if the direction cosines
themselves are varying, provided that we take all their values at the same instant. But if
we have to differentiate with regard to the time we are considering different instants, and
special attention to the variation of the direction cosines becomes necessary. To take a
very simple case, let a particle be moving with uniform angular velocity in a circle, so that

$$x_1 = a\cos\omega t, \qquad x_2 = a\sin\omega t, \qquad x_3 = 0.$$

The velocity components are $(-\omega a\sin\omega t,\ \omega a\cos\omega t,\ 0)$, and the
acceleration components $(-\omega^2x_1,\ -\omega^2x_2,\ 0)$. Now take a set of rotating axes,
$x'_3$ coinciding with $x_3$, while $x'_1$ is inclined at $\omega t$ to $x_1$ and therefore
permanently directed towards the particle. Then the coordinates in the $O1'2'3'$ system are
permanently $(a, 0, 0)$, and their rates of change are $(0, 0, 0)$. But the components of the
velocity relative to $O123$ along these axes are $(0, \omega a, 0)$ and those of the acceleration
$(-\omega^2a, 0, 0)$. Rates of change of the coordinates with respect to sets of axes in relative
rotation do not transform according to the vector rule. We can say that the operations of
differentiation with regard to $t$ and resolution in a given direction commute only if the
direction is fixed.*
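The circular-motion example can be reproduced directly. In the Python sketch below (the function name and the sample values of $a$, $\omega$, $t$ are assumptions for illustration), the position, velocity and acceleration in the fixed frame are resolved along axes that have turned through $\omega t$ about $x_3$.

```python
import math


def resolve(vec, angle):
    """Components of vec along axes turned through `angle` about x3."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * vec[0] + s * vec[1], -s * vec[0] + c * vec[1], vec[2])


a, omega, t = 2.0, 0.5, 1.3     # illustrative radius, angular velocity, time
pos = (a * math.cos(omega * t), a * math.sin(omega * t), 0.0)
vel = (-omega * a * math.sin(omega * t), omega * a * math.cos(omega * t), 0.0)
acc = (-omega ** 2 * pos[0], -omega ** 2 * pos[1], 0.0)

pos_rot = resolve(pos, omega * t)   # (a, 0, 0): constant, so its rate of change is zero
vel_rot = resolve(vel, omega * t)   # (0, omega*a, 0): not the rate of change of pos_rot
acc_rot = resolve(acc, omega * t)   # (-omega**2 * a, 0, 0)
```

The resolved velocity is not zero although the resolved coordinates never change, which is the failure of commutation described above.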

* We shall meet the non-commutative property of operators repeatedly. The simplest case is
where the operations are multiplication by $x$ and differentiation with regard to $x$:

$$\frac{d}{dx}\{x f(x)\} = x\frac{d}{dx}f(x) + f(x),$$

which is not the same as $x\,\dfrac{d}{dx}f(x)$.

The elementary form of the equations of motion of a particle, $m\ddot{x}_i = X_i$, requires the
axes to be inertial. If we use instead a set of rotating axes $x'_j$, the force can be resolved
along them by the rule for a vector, and the equations are equivalent to

$$m\,l_{ij}\ddot{x}_i = l_{ij}X_i = X'_j, \qquad (1)$$

but the left side is not equal to $m\ddot{x}'_j$. In dealing with the motion of rigid bodies,
especially, it is usually convenient to state the equations of motion referred to rotating axes,
which we can often take to be fixed in the body, and therefore we require to be able to
express $l_{ij}\ddot{x}_i$ in terms of $x'_j$ and its derivatives; similarly, if $A_i$ are the
components of displacement, velocity or angular momentum with respect to inertial axes, we need
expressions for $l_{ij}\,dA_i/dt$ in terms of $l_{ij}A_i$ and its derivatives. We can continue to
denote $l_{ij}A_i$ by $A'_j$. It will still be true that if $B_i = A_i$, then

$$B'_j = A'_j, \qquad (2)$$

even though the axes of $x'_j$ are rotating, and conversely. Then

$$A_i = l_{ij}A'_j, \qquad (3)$$

$$\frac{dA_i}{dt} = l_{ij}\frac{dA'_j}{dt} + \frac{dl_{ij}}{dt}A'_j. \qquad (4)$$

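Now if we take a point on the $x'_j$ axis at a fixed distance $c$ from the origin, its coordinates
with respect to the $x_i$ axes are $c\,l_{ij}$ and its velocity components are $c\,dl_{ij}/dt$.
But the $x'_j$ axes are a rigid frame; hence the velocity of a point rigidly attached to them
with coordinates $x_i$ is $-\theta_{ik}x_k$, where $\theta_{ik}$ represents their rate of rotation
as in 3·092. Hence the $x_i$ velocity of the point just considered is $-c\,\theta_{ik}l_{kj}$, and

$$\frac{dl_{ij}}{dt} = -\theta_{ik}l_{kj}, \qquad (5)$$

$$\frac{dA_i}{dt} = l_{ij}\frac{dA'_j}{dt} - \theta_{ik}l_{kj}A'_j, \qquad (6)$$

$$l_{il}\frac{dA_i}{dt} = l_{il}l_{ij}\frac{dA'_j}{dt} - l_{il}l_{kj}\theta_{ik}A'_j. \qquad (7)$$

But $l_{il}l_{ij} = \delta_{jl}$, and $l_{il}l_{kj}\theta_{ik}$ is $\theta'_{lj}$, defined as the
result of transforming $\theta_{ik}$ to the axes $x'_l$, $x'_j$ according to the rule for
second-order tensors. This gives the components required:

$$l_{il}\frac{dA_i}{dt} = \frac{dA'_l}{dt} - \theta'_{lj}A'_j \qquad (8)$$

$$= (\dot{A}'_1 - \theta'_3A'_2 + \theta'_2A'_3,\ \ \dot{A}'_2 - \theta'_1A'_3 + \theta'_3A'_1,\ \ \dot{A}'_3 - \theta'_2A'_1 + \theta'_1A'_2). \qquad (9)$$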


These are the components along $x'_l$ of a vector whose components along $x_i$ are $dA_i/dt$;
and it is easy to verify that if we take a third frame of reference the components with regard
to it will be the same whether we transform directly from the $i$ frame or by way of the
$j$ frame. These components therefore satisfy the consistency rule for vectors.
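A numerical check of (8) and (9) is straightforward when the axes turn about $x_3$ at a constant rate. The sketch below is an illustration under that assumption: the vector $\mathbf{A}(t)$, the rate of rotation and the helper names are all chosen for the example, and the derivative of the resolved components is estimated by a central difference.

```python
import math


def resolve(vec, angle):
    """Components of vec along axes turned through `angle` about x3."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * vec[0] + s * vec[1], -s * vec[0] + c * vec[1], vec[2])


def cross(u, w):
    """Vector product u ^ w."""
    return (u[1] * w[2] - u[2] * w[1], u[2] * w[0] - u[0] * w[2], u[0] * w[1] - u[1] * w[0])


def A(t):
    """An arbitrary vector, components along the fixed axes (assumed example)."""
    return (t * t, math.sin(t), t)


def A_dot(t):
    """Its exact rate of change along the fixed axes."""
    return (2 * t, math.cos(t), 1.0)


def A_rot(t, rate):
    """Components of A along axes that have turned through rate*t about x3."""
    return resolve(A(t), rate * t)


rate, t, h = 0.8, 1.1, 1e-5
dA_rot = tuple((p - m) / (2 * h) for p, m in zip(A_rot(t + h, rate), A_rot(t - h, rate)))
theta = (0.0, 0.0, rate)                         # angular velocity of the axes
rhs = tuple(d + c for d, c in zip(dA_rot, cross(theta, A_rot(t, rate))))
lhs = resolve(A_dot(t), rate * t)                # dA/dt resolved along the moving axes
```

`lhs` and `rhs` agree to within the error of the finite difference, whereas `dA_rot` alone does not equal `lhs`.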

Whether we regard the separate terms $d\mathbf{A}'/dt$ and $\boldsymbol{\theta}\wedge\mathbf{A}$
as vectors is a matter of definition, and two methods are open to us. [...] the components of a
vector we have to show that this rule is satisfied; and in this case with this transformation
rule it is not.

The alternative method is to restrict the scope of the transformation rule to axes that
are not in relative rotation. To simplify matters suppose that $\mathbf{A}$ is a displacement
from the origin. Then $d\mathbf{A}/dt$ is by definition a velocity, the components of which are
relative to a fixed frame. We can, however, imagine a triply infinite set of fixed frames with
the same origin, and at any moment the moving axes are passing through one of them, which is
thereby identified. We are entitled to resolve with respect to this frame, and the velocity
components with regard to it are $l_{ij}\,dA_i/dt$. These are the components of a vector.
$\boldsymbol{\theta}\wedge\mathbf{A}$ is the velocity, referred to any of the fixed frames, of a
point rigidly connected to the moving axes. Such a motion is possible, and in this sense
$\boldsymbol{\theta}\wedge\mathbf{A}$ is a vector. Then $l_{ij}\,dA'_j/dt$ are the components,
with regard to the $x_i$ frame, of a vector; but it is not the same vector as $d\mathbf{A}/dt$.
It is the velocity of a point moving in the $x_i$ frame with the part of the velocity
$d\mathbf{A}/dt$ that is not expressed in $\boldsymbol{\theta}\wedge\mathbf{A}$.

With this interpretation the expressions in (8) are to be regarded as components of the
rate of change of $\mathbf{A}$, not along the moving axes, but along the fixed axes that they are
instantaneously passing through.

Either of these interpretations is tenable; the important thing to realize is that they
are not the same thing and they must not both be made at once, otherwise mistakes are
inevitable.*

The angular velocity $\theta'_{lj}$ is sometimes called the angular velocity of the rotating
axes 'with reference to themselves', without any clear explanation of why such an angular
velocity should not be identically zero. On inspection of its derivation we see that it is
the angular velocity of the axes about fixed axes instantaneously coinciding with them. If
the angular velocity about any fixed axes is known, $\theta'_{lj}$ or its vectorial
representation is found by resolving.

* A similar point arises in what is called 'covariant differentiation' in general relativity. Cf.
Eddington, Mathematical Theory of Relativity; McConnell, Absolute Differential Calculus.

3·13. Applications to mechanics. Euler's angles. The position of a rigid body,
of which one point $O$ is fixed, can be specified with regard to a fixed frame of reference
$Oxyz$ in the following way. We take first a marked line of particles in the body and take this
as the axis $O3$ of a set of rectangular axes fixed in the body. The polar angles of this line
are denoted by $\theta$, $\lambda$. The position of the body is now known except for a possible
rotation about $O3$, so that it is completely fixed if we know the angle between a marked plane
of the body through $O3$ and the plane $zO3$. We call this angle $\chi$, and the three angles
$\theta$, $\lambda$, $\chi$ are Euler's angles.

We have to express the angular velocity of the body in terms of the rates of change of these
three angles. We use an intermediate frame of reference $O123$, where $O1$ is in the plane $zO3$
and at right angles to $O3$. Then $O2$ is normal to the plane $zO3$. All frames are taken to be
right-handed. $O123$ is not in general fixed either in space or in the body. But if $O31'$ is a
fixed plane in the body making an angle $\chi$ with $O31$, we can take a third axis $O2'$
perpendicular to $O31'$, and this also is fixed in the body.

Evidently the rate of rotation of the body is specified by rates $\dot\lambda$ about $Oz$,
$\dot\theta$ about $O2$, and $\dot\chi$ about $O3$. These can be resolved about any convenient
directions. Their components with regard to $O123$ are clearly
$(-\sin\theta\,\dot\lambda,\ \dot\theta,\ \dot\chi + \cos\theta\,\dot\lambda)$. Resolving again
with regard to $O1'2'3$ we get the components

$$(-\sin\theta\,\dot\lambda\cos\chi + \dot\theta\sin\chi,\ \ \sin\theta\,\dot\lambda\sin\chi + \dot\theta\cos\chi,\ \ \dot\chi + \cos\theta\,\dot\lambda),$$

which are the components of the angular velocity of the body about axes fixed in the
body.
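The two resolutions just described can be written as a short routine. This is a sketch only: the function names and argument order are assumptions, and the angles and rates in the usage lines are arbitrary sample values.

```python
import math


def omega_intermediate(theta, lam_dot, theta_dot, chi_dot):
    """Angular velocity of the body resolved along the intermediate axes O123."""
    return (-math.sin(theta) * lam_dot, theta_dot, chi_dot + math.cos(theta) * lam_dot)


def omega_body(theta, chi, lam_dot, theta_dot, chi_dot):
    """The same angular velocity resolved along O1'2'3, i.e. after turning through chi about O3."""
    w1, w2, w3 = omega_intermediate(theta, lam_dot, theta_dot, chi_dot)
    c, s = math.cos(chi), math.sin(chi)
    return (w1 * c + w2 * s, -w1 * s + w2 * c, w3)


# Sample values (assumed): theta = 0.6, chi = 1.1 and rates 0.3, -0.2, 0.7.
w_mid = omega_intermediate(0.6, 0.3, -0.2, 0.7)
w_body = omega_body(0.6, 1.1, 0.3, -0.2, 0.7)
```

Since the second step is only a rotation of axes about $O3$, the magnitude of the angular velocity and its third component are the same in both frames.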

In many actual problems the body has an axis of symmetry, which can be taken as $O3$.
It is then convenient to use the frame of reference $O123$ rather than $O1'2'3$ on account of
the simpler relations between the components of angular velocity and Euler's angles.
There is one important difference, since we shall see that it is necessary to consider both
the rotation of the body and that of the frame of reference, when the latter is not fixed
in space. If we denote the angular velocity of the frame of reference by
$\boldsymbol{\theta}(\theta_1, \theta_2, \theta_3)$, then if the frame is fixed in the body

$$\boldsymbol{\theta} = \boldsymbol{\omega}.$$

But if we use the frame $O123$ this has not the angular velocity component $\dot\chi$, since $O1$
always remains in the plane $zO3$. Hence for it

$$(\theta_1, \theta_2, \theta_3) = (-\sin\theta\,\dot\lambda,\ \dot\theta,\ \cos\theta\,\dot\lambda),$$

while the components of the angular velocity of the body are then

$$(\omega_1, \omega_2, \omega_3) = (-\sin\theta\,\dot\lambda,\ \dot\theta,\ \dot\chi + \cos\theta\,\dot\lambda).$$

The relations $\theta_1 = \omega_1$, $\theta_2 = \omega_2$ evidently mean that the axis $O3$ has
the same angular velocity as the line of particles occupying it; that is, that $O3$ is fixed in
the body. The fact that $\theta_3 \neq \omega_3$ is a reminder that $O1$ and $O2$ are not fixed
in the body.

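3·131. To illustrate the method, let us calculate the acceleration components of a
particle in spherical polar coordinates $(r, \theta, \lambda)$. We take the axis $O3$ towards the
particle; the axes $O1$ and $O2$ are taken as in 3·13, and the components of angular velocity of
the axes are $(-\sin\theta\,\dot\lambda,\ \dot\theta,\ \cos\theta\,\dot\lambda)$. As the particle
is permanently on the axis $O3$ its velocity components are

$$(0, 0, \dot r) + (\theta_1, \theta_2, \theta_3)\wedge(0, 0, r) = (r\dot\theta,\ r\sin\theta\,\dot\lambda,\ \dot r),$$

as may also be seen by inspection. Then the acceleration components are

$$(a_1, a_2, a_3) = \frac{d}{dt}(r\dot\theta,\ r\sin\theta\,\dot\lambda,\ \dot r) + (\theta_1, \theta_2, \theta_3)\wedge(r\dot\theta,\ r\sin\theta\,\dot\lambda,\ \dot r),$$

$$a_1 = \frac{d}{dt}(r\dot\theta) + \dot r\dot\theta - r\sin\theta\cos\theta\,\dot\lambda^2 = r(\ddot\theta - \sin\theta\cos\theta\,\dot\lambda^2) + 2\dot r\dot\theta,$$

$$a_2 = \frac{d}{dt}(r\sin\theta\,\dot\lambda) + \cos\theta\,\dot\lambda\,r\dot\theta + \sin\theta\,\dot\lambda\,\dot r = \frac{1}{r\sin\theta}\frac{d}{dt}(r^2\sin^2\theta\,\dot\lambda),$$

$$a_3 = \ddot r - r\sin^2\theta\,\dot\lambda^2 - r\dot\theta^2 = \ddot r - r(\dot\theta^2 + \sin^2\theta\,\dot\lambda^2).$$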
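These components can be compared with a direct differentiation of the Cartesian coordinates. In the sketch below the motion $r(t)$, $\theta(t)$, $\lambda(t)$ is an arbitrary smooth example chosen for illustration, its derivatives are entered by hand, and the Cartesian acceleration is estimated by a second central difference and then resolved along $O1$, $O2$, $O3$ (the directions of increasing $\theta$, increasing $\lambda$ and increasing $r$).

```python
import math


def r(t): return 1.0 + 0.3 * t
def th(t): return 0.9 + 0.2 * math.sin(t)
def lam(t): return 0.5 * t * t


def cartesian(t):
    s = math.sin(th(t))
    return (r(t) * s * math.cos(lam(t)), r(t) * s * math.sin(lam(t)), r(t) * math.cos(th(t)))


def acceleration_formula(t):
    """(a1, a2, a3) from the formulae of 3.131, with exact derivatives of the assumed motion."""
    R, T = r(t), th(t)
    Rd, Rdd = 0.3, 0.0
    Td, Tdd = 0.2 * math.cos(t), -0.2 * math.sin(t)
    Ld, Ldd = t, 1.0
    s, c = math.sin(T), math.cos(T)
    a1 = R * (Tdd - s * c * Ld ** 2) + 2 * Rd * Td
    # (1/(r sin theta)) d/dt (r^2 sin^2 theta lambda-dot), written out term by term
    a2 = (2 * R * Rd * s * s * Ld + 2 * R * R * s * c * Td * Ld + R * R * s * s * Ldd) / (R * s)
    a3 = Rdd - R * (Td ** 2 + s * s * Ld ** 2)
    return (a1, a2, a3)


def acceleration_numeric(t, h=1e-4):
    """Second central difference of the Cartesian position, resolved along O1, O2, O3."""
    acc = [(p - 2 * c0 + m) / (h * h)
           for p, c0, m in zip(cartesian(t + h), cartesian(t), cartesian(t - h))]
    T, L = th(t), lam(t)
    e1 = (math.cos(T) * math.cos(L), math.cos(T) * math.sin(L), -math.sin(T))
    e2 = (-math.sin(L), math.cos(L), 0.0)
    e3 = (math.sin(T) * math.cos(L), math.sin(T) * math.sin(L), math.cos(T))
    return tuple(sum(a * b for a, b in zip(acc, e)) for e in (e1, e2, e3))


formula = acceleration_formula(0.8)
numeric = acceleration_numeric(0.8)
```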
3·14. Euler's dynamical equations. Let $O1'2'3'$ be taken along the principal
axes of a rigid body moving with $O$ fixed. If $A$, $B$, $C$ are the principal moments of inertia
and $\boldsymbol{\omega}$ is the angular velocity of the body about inertial axes, the angular
momentum about $O$ is $\mathbf{h}(O) = (A\omega_1, B\omega_2, C\omega_3)$. The rate of change of
angular momentum about $O$ is therefore

$$\frac{d\mathbf{h}}{dt} + \boldsymbol{\omega}\wedge\mathbf{h},$$

with components $(A\dot\omega_1 - (B - C)\omega_2\omega_3,\ \text{etc.})$ about the instantaneous
positions of the principal axes.


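3·141. Motion of a top. The axis of symmetry is taken as $O3$; using the axes $O123$
as in 3·13, the angular velocity of the top is
$(-\dot\lambda\sin\theta,\ \dot\theta,\ \dot\chi + \dot\lambda\cos\theta)$ and that of the axes
is $(-\dot\lambda\sin\theta,\ \dot\theta,\ \dot\lambda\cos\theta)$. The moments of inertia being
$A$, $A$, $C$, the components of angular momentum are
$\{-A\dot\lambda\sin\theta,\ A\dot\theta,\ C(\dot\chi + \dot\lambda\cos\theta)\}$.

If $\mathbf{N}$ is the moment of the external forces

$$\frac{d\mathbf{h}}{dt} + \boldsymbol{\theta}\wedge\mathbf{h} = \mathbf{N}. \qquad (1)$$

If the top is moving under gravity and its centre of mass is at $(0, 0, h)$,

$$\mathbf{N} = (0,\ Mgh\sin\theta,\ 0). \qquad (2)$$

The third component of (1) gives immediately

$$\dot\chi + \dot\lambda\cos\theta = \text{const.} = n, \qquad (3)$$

say. The second component gives

$$A\ddot\theta + Cn\dot\lambda\sin\theta - A\dot\lambda^2\sin\theta\cos\theta = Mgh\sin\theta, \qquad (4)$$

so that the condition for a steady precession with $\theta = \alpha$ and $\dot\lambda = \Omega$ is

$$A\Omega^2\cos\alpha - Cn\Omega + Mgh = 0.$$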
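The condition for steady precession is a quadratic in $\Omega$, so for a given spin $n$ and inclination $\alpha$ there are two rates of precession, or none if the spin is too small. A minimal Python sketch follows; the function name and the numerical values (in arbitrary consistent units) are assumptions for illustration.

```python
import math


def steady_precession_rates(A, C, n, Mgh, alpha):
    """Real roots Omega of A*Omega**2*cos(alpha) - C*n*Omega + Mgh = 0, in increasing order.

    Returns an empty list when the quadratic has no real root."""
    qa = A * math.cos(alpha)
    if abs(qa) < 1e-15:                 # axis horizontal: the equation is linear in Omega
        return [Mgh / (C * n)]
    disc = (C * n) ** 2 - 4 * qa * Mgh
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted([(C * n - root) / (2 * qa), (C * n + root) / (2 * qa)])


# Illustrative values (assumed): A = 1, C = 0.5, Mgh = 5, alpha = 30 degrees.
fast_spin = steady_precession_rates(1.0, 0.5, 40.0, 5.0, math.pi / 6)
slow_spin = steady_precession_rates(1.0, 0.5, 1.0, 5.0, math.pi / 6)
```

With the rapid spin the smaller root is close to $Mgh/Cn$, the usual slow precession; with the small spin the list is empty, so no steady precession at that inclination is possible.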

For the general motion the simplest method is to notice that there are two other first
integrals:

(1) The angular momentum about the vertical is constant, whence

$$A\sin^2\theta\,\dot\lambda + Cn\cos\theta = \text{constant}.$$

(2) The total energy is constant, whence

$$A(\dot\lambda^2\sin^2\theta + \dot\theta^2) + C(\dot\lambda\cos\theta + \dot\chi)^2 + 2Mgh\cos\theta = \text{constant}.$$

Once the expressions for the kinetic and potential energy have been found, the equations
of motion can be found by using Lagrange's equations (10·07).

3·15. Tensors in two dimensions. If the only rotations of the axes permitted do
not displace the axis of $x_3$, the possible changes of axes are more restricted than for a
general rotation, and additional sets of functions are found to have the requisite
transformation properties under the changes permitted. In fact if $u_i$ ($i = 1, 2$) is a vector in
two dimensions its component in a direction $\alpha$ is $u_1\cos\alpha + u_2\sin\alpha$. Now there
is a vector $\mathbf{v}$ whose component in direction $\alpha$ is equal to that of $\mathbf{u}$ in
direction $\tfrac12\pi + \alpha$. For this is equivalent to

$$v_1\cos\alpha + v_2\sin\alpha = u_1\cos(\tfrac12\pi + \alpha) + u_2\sin(\tfrac12\pi + \alpha) = -u_1\sin\alpha + u_2\cos\alpha, \qquad (1)$$

and therefore if

$$v_1 = u_2, \qquad v_2 = -u_1, \qquad (2)$$

$v_1$ and $v_2$ are the components of a vector, which are the components of $\mathbf{u}$ referred
to axes rotated through a right angle. In particular $(x_2, -x_1)$ are the components of a vector.

But the derivative of a vector is a tensor of the second order. Taking this vector to be
$(x_2, -x_1)$ we have that

$$\eta_{ik} = \frac{\partial v_k}{\partial x_i} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \qquad (3)$$

is a tensor in two dimensions. Thus $\delta_{ik}$ is not the only isotropic tensor of order 2 in
two dimensions; and any linear combination

$$\lambda\,\delta_{ik} + \mu\,\eta_{ik}, \qquad (4)$$

where $\lambda$ and $\mu$ are scalars, is another.
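That the array in (3) keeps the same components under every rotation of the axes in the plane can be confirmed numerically. The sketch below is illustrative: the function name, the angle 0.7 and the scalars 2.0 and −1.5 are assumed for the example.

```python
import math


def transform(T, alpha):
    """T'_{jl} = l_ij l_kl T_ik for axes turned through alpha in the plane.

    l[i][j] is the cosine of the angle between the old axis x_i and the new axis x'_j."""
    c, s = math.cos(alpha), math.sin(alpha)
    l = [[c, -s], [s, c]]
    return [[sum(l[i][j] * l[k][m] * T[i][k] for i in range(2) for k in range(2))
             for m in range(2)] for j in range(2)]


eta = [[0.0, -1.0], [1.0, 0.0]]
eta_new = transform(eta, 0.7)

# A general combination lam*delta + mu*eta, with illustrative scalars.
combo = [[2.0, 1.5], [-1.5, 2.0]]
combo_new = transform(combo, 0.7)

# A symmetrical tensor that is not isotropic, for contrast.
other = [[1.0, 0.0], [0.0, 0.0]]
other_new = transform(other, 0.7)
```

`eta_new` and `combo_new` reproduce `eta` and `combo`, while `other_new` differs from `other`.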

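Now consider the fourth-order tensor $\eta_{im}\eta_{kp}$. All components with $i = m$ or $k = p$
vanish, and

$$\eta_{12}\eta_{12} = \eta_{21}\eta_{21} = 1, \qquad \eta_{12}\eta_{21} = -1. \qquad (5)$$

Now $\eta_{im}\eta_{kp}\,\partial^2\phi/\partial x_m\,\partial x_p$ is a second-order tensor if
$\phi$ is a scalar. But if $i = k = 1$ the only non-zero terms are for $m = p = 2$, and the
component is $\partial^2\phi/\partial x_2^2$. If $i = 1$, $k = 2$, we must take $m = 2$, $p = 1$
and the component is $-\partial^2\phi/\partial x_1\,\partial x_2$. Proceeding, we see that

$$\begin{pmatrix} \dfrac{\partial^2\phi}{\partial x_2^2} & -\dfrac{\partial^2\phi}{\partial x_1\,\partial x_2} \\[2ex] -\dfrac{\partial^2\phi}{\partial x_1\,\partial x_2} & \dfrac{\partial^2\phi}{\partial x_1^2} \end{pmatrix} \qquad (6)$$

are the components of a second-order tensor in two dimensions. This tensor has applications
in elasticity, particularly in relation to the bending of thin plates and the distribution
of stress between parallel planes.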

3·16. Parallax. Tensor methods are sometimes useful in obtaining the formulae of
spherical astronomy. Consider the parallax of the moon or a planet. Take the origin at the
centre of the earth, axis 3 towards the pole, axis 2 on the observer's meridian. Let the
coordinates of the planet be $x_i = r l_i$, those of the observer $a\lambda_i$. The distance
from the observer to the planet is $R$, given by

$$R^2 = (x_i - a\lambda_i)^2 = r^2 - 2ra\lambda_kl_k + a^2, \qquad (1)$$

$$R = r - a\lambda_kl_k, \qquad (2)$$

to the first order in $a/r$. Then the direction cosines of the line from the observer to the
planet are $l'_i$, where

$$l'_i = \frac{x_i - a\lambda_i}{R} = l_i + P(-\lambda_i + l_i\lambda_kl_k), \qquad (3)$$

where $P = a/r$, the horizontal parallax. Now the direction cosines are given in terms of
the angular coordinates by

$$l_1 = \cos\delta\sin\lambda, \qquad l_2 = \cos\delta\cos\lambda, \qquad l_3 = \sin\delta,$$

$$\lambda_1 = 0, \qquad \lambda_2 = \cos\phi, \qquad \lambda_3 = \sin\phi, \qquad (4)$$

and

$$\lambda_kl_k = \cos\phi\cos\delta\cos\lambda + \sin\phi\sin\delta. \qquad (5)$$

For the parallax in declination $\varpi_\delta$

$$l'_3 - l_3 = \cos\delta\,\varpi_\delta = P\{-\sin\phi + \sin\delta(\cos\phi\cos\delta\cos\lambda + \sin\phi\sin\delta)\} = P(\sin\delta\cos\delta\cos\phi\cos\lambda - \sin\phi\cos^2\delta), \qquad (6)$$

whence

$$\varpi_\delta = P(\sin\delta\cos\phi\cos\lambda - \sin\phi\cos\delta). \qquad (7)$$

If $\lambda'$ is the observed hour angle

$$\tan\lambda' = l'_1/l'_2, \qquad \tan\lambda = l_1/l_2, \qquad \log\tan\lambda = \log l_1 - \log l_2, \qquad (8)$$

$$\frac{\tan\lambda' - \tan\lambda}{\tan\lambda} = \frac{l'_1 - l_1}{l_1} - \frac{l'_2 - l_2}{l_2} = P\left(-\frac{\lambda_1}{l_1} + \frac{\lambda_2}{l_2}\right) = \frac{P\cos\phi}{\cos\delta\cos\lambda}, \qquad (9)$$

$$\lambda' - \lambda = P\cos\phi\sin\lambda\sec\delta, \qquad (10)$$

and the parallax in right ascension is $-(\lambda' - \lambda)$.
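The first-order results (7) and (10) can be compared with the exact change of direction computed from the geometry. The following Python sketch does this; the function names are assumptions, and the sample values (a horizontal parallax of 0.0166 radian, roughly that of the moon, with arbitrary latitude, declination and hour angle) are chosen only for illustration.

```python
import math


def direction(delta, lam):
    """Direction cosines (l1, l2, l3) for declination delta and hour angle lam."""
    return (math.cos(delta) * math.sin(lam), math.cos(delta) * math.cos(lam), math.sin(delta))


def exact_shift(P, phi, delta, lam):
    """Exact (delta' - delta, lam' - lam) seen from latitude phi, with P = a/r."""
    l = direction(delta, lam)
    obs = (0.0, math.cos(phi), math.sin(phi))
    d = [li - P * oi for li, oi in zip(l, obs)]      # (x_i - a*lambda_i)/r
    norm = math.sqrt(sum(c * c for c in d))
    l1, l2, l3 = (c / norm for c in d)
    return math.asin(l3) - delta, math.atan2(l1, l2) - lam


def first_order_shift(P, phi, delta, lam):
    """The same two quantities from equations (7) and (10)."""
    d_delta = P * (math.sin(delta) * math.cos(phi) * math.cos(lam)
                   - math.sin(phi) * math.cos(delta))
    d_lam = P * math.cos(phi) * math.sin(lam) / math.cos(delta)
    return d_delta, d_lam


P, phi, delta, lam = 0.0166, 0.9, 0.3, 0.7           # illustrative values, radians
exact = exact_shift(P, phi, delta, lam)
approx = first_order_shift(P, phi, delta, lam)
```

The two pairs differ only by terms of order $P^2$, as expected of an expansion carried to the first order in $a/r$.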



EXAMPLES 


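1. Show that $(K.A)\,.\,(K.B)\wedge(K.C) = \|K\|\,A\,.\,B\wedge C$.

2. Show that $\operatorname{curl}(\phi A) = \phi\operatorname{curl}A + \operatorname{grad}\phi\wedge A$.

3. Show that $\{\operatorname{curl}(A\wedge B)\}_i = A_i\operatorname{div}B - B_i\operatorname{div}A + B_k\dfrac{\partial A_i}{\partial x_k} - A_k\dfrac{\partial B_i}{\partial x_k}$.

4. If $K_{ik}$ is a tensor, prove that

$$\begin{vmatrix} K_{22} & K_{23} \\ K_{32} & K_{33} \end{vmatrix} + \begin{vmatrix} K_{33} & K_{31} \\ K_{13} & K_{11} \end{vmatrix} + \begin{vmatrix} K_{11} & K_{12} \\ K_{21} & K_{22} \end{vmatrix}$$

is a scalar. Relate this scalar to an invariant of the roots of the equation
$\|K_{ik} - \lambda\delta_{ik}\| = 0$.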

5. If the stress tensor is $2A_iA_k - \delta_{ik}A^2$, where $A_i$ is a vector function of
position, $A^2$ denotes $A_1^2 + A_2^2 + A_3^2$, and $\delta_{ik}$ equals unity if $i = k$ and
equals zero otherwise, verify that at any point the direction of $A_i$ is a principal direction
of stress, and show that the stress at this point can be represented as a tension $A^2$ along
the direction of $A_i$ and a pressure $A^2$ normal to this direction.

6. A body is deformed by internal stress so that the particle which in the strained state is at
the point $(x, y, z)$ referred to a set of fixed rectangular axes has undergone the displacement
whose components are

$$3\kappa(2x - y + z), \qquad -3\kappa(x + y), \qquad \kappa(3x + 5z),$$

where $\kappa$ is a small constant. Show that there is no rotation of any small portion of the
body in the neighbourhood of $(x, y, z)$, that one principal extension is $3\kappa$, and determine
the other two principal extensions. (Prelim. 1941.)

7. For a chain of three uniform mutually perpendicular rods each of mass $m$ and length $2a$,
show that the equation of the inertia quadric at the mass centre referred to axes parallel to
the rods may be written in the form

$$\tfrac23 ma^2(5x^2 + 5y^2 + 3z^2 - 3yz + 3zx + xy) = K.$$

Deduce that one principal moment of inertia at the mass centre is $ma^2$ and find the others.
(Prelim. 1940.)

8. Two equal uniform cones of height $h$ are placed with their bases, of radius $a$, in contact.
Determine the inertia tensor at the common centre of the bases, and give the ratios of the
principal moments of inertia. (I.C. 1943.)

9. Determine the inertia tensor at the centre of one end of a uniform solid right prism of mass
$M$ and length $2H$, whose ends are equilateral triangles of side $a$. Determine the principal
moments of inertia when (1) $a = H$, (2) $a = 10H$. In the second case show that the moment of
inertia of the prism about any of the end edges is $\tfrac{83}{6}MH^2$. (I.C. 1940.)

10. Motion relative to the earth. The position vector of a particle relative to a point $O$ at
the earth's surface is $\mathbf{r}$; the velocity and acceleration of the particle relative to a
frame at rest relative to the earth at $O$ are $\dot{\mathbf{r}}$ and $\ddot{\mathbf{r}}$. If the
particle moves under the earth's attraction and a force $\mathbf{F}$ per unit mass, prove that

$$\ddot{\mathbf{r}} + 2\boldsymbol{\omega}\wedge\dot{\mathbf{r}} = \mathbf{g}(\mathbf{r}) + \mathbf{F},$$

where $\boldsymbol{\omega}$ is the earth's angular velocity and $\mathbf{g}(\mathbf{r})$ is the
acceleration relative to the frame at $O$ of a freely falling particle instantaneously at rest
relative to the frame at the point $\mathbf{r}$.

If a particle is projected with velocity $\mathbf{V}$ from $O$ at time $t = 0$, show that to the
first order in $\omega$ its position vector $\mathbf{r}$ at time $t$ is given by

$$\mathbf{r} = \mathbf{V}t + \tfrac12\mathbf{g}t^2 - t^2\,\boldsymbol{\omega}\wedge\mathbf{V} - \tfrac13 t^3\,\boldsymbol{\omega}\wedge\mathbf{g},$$

where $\mathbf{g} = \mathbf{g}(0)$.

Foucault's pendulum. Small oscillations. The origin is taken at the point of suspension,
$\mathbf{i}$ is the direction vector towards the bob of mass $m$. Then if $T$ is the tension in
the string, to the first order in $\omega$

$$\ddot{\mathbf{r}} + 2\boldsymbol{\omega}\wedge\dot{\mathbf{r}} = \mathbf{g} - \frac{T}{m}\,\mathbf{i}.$$

Putting $\mathbf{g} = g\mathbf{z}$ and $\mathbf{i} = \mathbf{z} + \boldsymbol{\rho}$, show that to
the first order in $\rho$,

$$\ddot{\boldsymbol{\rho}} - 2\omega\sin\lambda\,\mathbf{z}\wedge\dot{\boldsymbol{\rho}} + \frac{g}{l}\,\boldsymbol{\rho} = 0,$$

where $\lambda$ is the latitude of $O$, and $l$ is the length of the string.

By taking the components of this equation in two perpendicular directions in a plane
perpendicular to $\mathbf{z}$, use the method of 2·12 to show that the plane of oscillation
rotates about the downward vertical with angular velocity $\omega\sin\lambda$.


Chapter 4

MATRICES

‘All that isn’t Belgrave Square is Strand and Piccadilly.’

W. S. Gilbert, Utopia Limited

4·01. Introduction: definitions. In considering tensors of the second order in three
dimensions we have used an abbreviated notation $K_{ik}$ for the set of nine quantities

$$\begin{pmatrix} K_{11} & K_{12} & K_{13} \\ K_{21} & K_{22} & K_{23} \\ K_{31} & K_{32} & K_{33} \end{pmatrix}.$$


When they are displayed in this way the first suffix refers to the row and the second to the 
column. Such a set forms a tensor if each suffix refers to one of a set of axes, of the same 
system of reference, and the coefficients have certain transformation properties when the 
axes are rotated. We have so far considered rectangular axes only, but this is a particular 
case of a more general tensor algebra. The generalizations take three lines: (a) to tensors 
of any order in n (especially four) dimensions, still referred to rectangular axes; (b) to 
tensors of any order in curvilinear coordinates; (c) to the study of an algebra of square 
arrays of quantities, which has many formal similarities to that of tensors of the second 
order, but is not restricted to an interpretation in terms of axes of reference. 

When we take the algebraic point of view we speak of an array of quantities 


$$\begin{pmatrix} K_{11} & K_{12} & \dots & K_{1n} \\ K_{21} & K_{22} & \dots & K_{2n} \\ \vdots & \vdots & & \vdots \\ K_{m1} & K_{m2} & \dots & K_{mn} \end{pmatrix}$$

as a matrix of order* $m \times n$. It has $m$ rows and $n$ columns. We may write it as $(K_{ik})$, the
first suffix referring to the row and the second to the column. A square matrix of order
$n \times n$ is a particular case. So is a single-column matrix of $m$ rows (order $m \times 1$) and a
single-row matrix of $n$ columns (order $1 \times n$). These three types of matrix have many applications
in physical theories.

We shall denote a matrix by a single symbol, which stands for the set of $m \times n$ quantities
or elements expressed in the matrix. For such an entity what we mean by addition, subtraction,
multiplication, and division is a matter of definition. The use of heavy type is
an indication that one or more suffixes are suppressed, as in vector and dyadic notation.

Addition. The sum of two matrices $\mathbf a$ and $\mathbf b$ is written as $\mathbf a + \mathbf b$ and
stands for the matrix with elements $a_{ik} + b_{ik}$.

Subtraction. The matrix $-\mathbf a$ is defined as the matrix with elements $-a_{ik}$, and
$\mathbf a - \mathbf b$ is defined as the matrix with elements $a_{ik} - b_{ik}$. For addition and
subtraction to be significant the matrices must have the same number of rows and the same number of columns.

* A tensor of order 2 as written above would thus have to be spoken of as a matrix of order $3 \times 3$.
The word order has quite distinct meanings as applied to tensors and matrices. A tensor of order
greater than 2 cannot be written as a matrix.





It is clear that for addition of matrices the associative law

$$(\mathbf a + \mathbf b) + \mathbf c = \mathbf a + (\mathbf b + \mathbf c), \tag{1}$$

and the commutative law

$$\mathbf a + \mathbf b = \mathbf b + \mathbf a \tag{2}$$

both hold. 

Multiplication. The law of multiplication is that $\mathbf{ab}$ is the matrix with elements
$(\mathbf{ab})_{ik}$ given by

$$(\mathbf{ab})_{ik} = a_{ij}b_{jk}, \tag{3}$$

the summation convention being understood. For this to be significant $j$ must range over
the same set of values in both factors, and therefore the number of columns of $\mathbf a$ must be
equal to the number of rows of $\mathbf b$. The product is then a matrix, with as many rows as $\mathbf a$
and as many columns as $\mathbf b$. In particular, if $\mathbf a$ is an $n \times n$ matrix, $\mathbf b$ must
have $n$ rows; but $\mathbf b$ may be a single column, and then the product matrix will also be a single
column. If $\mathbf b$ is also an $n \times n$ matrix, $\mathbf{ab}$ will be an $n \times n$ matrix. On the
other hand, if $\mathbf a$ is a single column, $\mathbf b$ must be a single row and $\mathbf{ab}$ is the
matrix whose elements are $a_i b_k$. If $\mathbf a$ is a single row ($1 \times n$), $\mathbf b$ must have
$n$ rows. If $\mathbf b$ is square the product is then a single row; if $\mathbf b$ has only
one column the product has one row and one column, that is, it is a single quantity. The
important cases for us are therefore, using single suffixes for matrices of one row or column
and double ones for square matrices,


          a                   b                   ab
    Rows   Columns      Rows   Columns      Rows   Columns      In suffixes
     n        n          n        n          n        n         $a_{ij}b_{jk}$
     n        n          n        1          n        1         $a_{ij}b_j$
     n        1          1        n          n        n         $a_i b_k$
     1        n          n        n          1        n         $a_j b_{jk}$
     1        n          n        1          1        1         $a_j b_j$


The third case does not involve the use of the summation convention and is called the 
outer product of a and b. The others are all called inner products. 
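
The five cases tabulated above can be tried numerically. The following Python sketch is an illustration only and not part of the original text; it assumes matrices are stored as lists of rows, and the name `matmul` is our own choice.

```python
def matmul(a, b):
    """Return ab with (ab)_ik = a_ij b_jk; matrices are lists of rows."""
    inner = len(b)  # number of rows of b
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must equal rows of b")
    return [[sum(a[i][j] * b[j][k] for j in range(inner))
             for k in range(len(b[0]))]
            for i in range(len(a))]


square = [[1, 2], [3, 4]]   # n x n
column = [[5], [6]]         # n x 1
row = [[7, 8]]              # 1 x n

square_column = matmul(square, column)  # n x 1, elements a_ij b_j
outer = matmul(column, row)             # n x n, elements a_i b_k (outer product)
row_square = matmul(row, square)        # 1 x n, elements a_j b_jk
row_column = matmul(row, column)        # 1 x 1, the single quantity a_j b_j
```

A product such as `matmul(column, square)` is not significant in the sense of the text, since the single column has one column but the square matrix has two rows; the sketch raises `ValueError` in that case.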


In works on algebra single-row and single-column matrices are often called vectors. This differs
from the physical use of the word vector. In algebra the vector is a set of $n$ elements and has nothing
to do with any particular transformation law. In physics the vector requires both the elements and
the assertion of a particular type of transformation law for its specification. Thus in physics we speak
of the same vector as having different components in different systems of reference; the algebraists
would call these representations themselves different vectors. We shall avoid this usage.


The commutative law of multiplication does not necessarily hold even if $\mathbf a$ and $\mathbf b$ are
square. For $\mathbf{ba}$ must be defined as the matrix whose elements are $b_{ij}a_{jk}$, and this will be
equal to $a_{ij}b_{jk}$ only in special cases; in general

$$\mathbf{ab} \neq \mathbf{ba}.$$

Pairs of matrices that satisfy $\mathbf{ab} = \mathbf{ba}$ are said to commute, those that satisfy
$\mathbf{ab} = -\mathbf{ba}$ to anticommute.
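
A few small instances make the three possibilities concrete. This Python sketch is an illustration under our own choice of matrices (the two complex matrices are the Pauli matrices usually written σ_x and σ_y, which are not mentioned in the text); `matmul` and `negate` are hypothetical helper names.

```python
def matmul(a, b):
    """(ab)_ik = a_ij b_jk for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def negate(m):
    return [[-x for x in row] for row in m]


# A generic pair: the two products differ.
a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
ab, ba = matmul(a, b), matmul(b, a)

# Two diagonal matrices commute.
d1 = [[2, 0], [0, 3]]
d2 = [[5, 0], [0, 7]]
diagonals_commute = matmul(d1, d2) == matmul(d2, d1)

# An anticommuting pair (assumed example, not from the text).
sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
anticommute = matmul(sigma_x, sigma_y) == negate(matmul(sigma_y, sigma_x))
```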

The rule of multiplication is that the factors are always arranged so that repeated
suffixes are adjacent. $a_{ij}b_{jk} = b_{jk}a_{ij}$, but this cannot be contracted to $\mathbf{ba}$ because
the two $j$’s are not adjacent. Explicit statement of the suffixes cannot lead to contradictions,
but if they are suppressed contradictions will arise unless we have a definite rule about
where the repeated suffixes are supposed to be. The rule suffices to distinguish $\mathbf{ba}$ from
$\mathbf{ab}$ while maintaining the order of the factors in the explicit expression for the product.

The associative law

$$(\mathbf{ab})\mathbf c = \mathbf a(\mathbf{bc}), \tag{4}$$

and the distributive law

$$\mathbf a(\mathbf b + \mathbf c) = \mathbf{ab} + \mathbf{ac} \tag{5}$$

hold, provided the order is maintained and the operations are significant. These are easily
verified by writing out the elements explicitly. For the first

$$\{(\mathbf{ab})\mathbf c\}_{il} = (a_{ij}b_{jk})c_{kl} = a_{ij}(b_{jk}c_{kl}) = \{\mathbf a(\mathbf{bc})\}_{il}. \tag{6}$$

In consequence these products can be written without brackets as $\mathbf{abc}$, since the
position of the brackets is irrelevant. It follows that all positive powers of a given matrix
commute; for $\mathbf a^2\mathbf a = \mathbf a\mathbf a^2$, and $\mathbf a^m\mathbf a^n = \mathbf a^n\mathbf a^m$
follows by induction.

The unit matrix will be denoted by $\mathbf 1$ and has components $\delta_{ik}$, where

$$\delta_{ik} = 1 \quad (i = k), \qquad \delta_{ik} = 0 \quad (i \neq k), \tag{7}$$

and we shall write $\delta_{ik}$ for its components. It is clear that

$$\mathbf{1a} = \mathbf{a1} = \mathbf a. \tag{8}$$

The unit matrix is often denoted by $I$, or even by 1 in cases where no ambiguity can
arise.

The null matrix is one all of whose elements are zero. 

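The product of two matrices may be null without either factor being null. Thus

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.$$

The non-zero element of the first can be multiplied only into elements of the first row of
the second, which are both zero. But if $\mathbf{AB} = 0$ whatever $\mathbf B$ may be, then $\mathbf A = 0$;
similarly, if $\mathbf{AB} = 0$ whatever $\mathbf A$ may be, then $\mathbf B = 0$.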

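The transposed matrix of a matrix $\mathbf a$ is the matrix formed from $\mathbf a$ by interchanging its
rows and columns. We shall denote it by $\tilde{\mathbf a}$ and its elements by $\tilde a_{ik}$. Then

$$\tilde a_{ik} = a_{ki}. \tag{9}$$

Since

$$(\tilde{\mathbf a}\tilde{\mathbf b})_{ik} = \tilde a_{ij}\tilde b_{jk} = a_{ji}b_{kj} = b_{kj}a_{ji} = (\mathbf{ba})_{ki}, \tag{10}$$

it follows that the transposed matrix of the product $\mathbf{ab}$, denoted by $\widetilde{\mathbf{ab}}$, is
equal to the product $\tilde{\mathbf b}\tilde{\mathbf a}$, in this order.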

Note that it would be possible to define $(\mathbf{ab})_{ik}$ to mean $a_{ij}b_{kj}$; this matrix is in fact
often required, but then we should have

$$(\mathbf{ab}.\mathbf c)_{ik} = (\mathbf{ab})_{ij}c_{kj} = a_{il}b_{jl}c_{kj}, \qquad (\mathbf a.\mathbf{bc})_{ik} = a_{ij}(\mathbf{bc})_{kj} = a_{ij}b_{kl}c_{jl},$$

which are not the same. The device of summing over adjacent suffixes is needed to make the
associative law of multiplication true. We therefore write

$a_{ij}b_{kj}$, not as $(\mathbf{ab})_{ik}$, but as $a_{ij}\tilde b_{jk} = (\mathbf a\tilde{\mathbf b})_{ik}$.

Beginners sometimes find multiplication of matrices easier if they first transpose the
second matrix and multiply rows into rows.

If $\mathbf a$ is a single-column matrix, $\tilde{\mathbf a}$ is a single-row matrix, and conversely. We have,
if $\mathbf b$ is a square matrix,

$$(\mathbf{ba})_i = b_{ij}a_j = a_j b_{ij} = \tilde a_j\tilde b_{ji} = (\tilde{\mathbf a}\tilde{\mathbf b})_i, \tag{11}$$

so that the same rule applies as for square matrices.
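
The rule for transposing a product, and the beginners' device of multiplying rows into rows, can both be checked on small integer matrices. The sketch below is illustrative only; the helper names `transpose`, `matmul` and `rows_into_rows` are our own.

```python
def transpose(m):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """(ab)_ik = a_ij b_jk for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rows_into_rows(a, b_transposed):
    """Form ab from a and the transposed b by multiplying rows into rows."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b_transposed]
            for row_a in a]


a = [[1, 2], [3, 4]]
b = [[0, 1], [5, 6]]
column = [[7], [8]]

product = matmul(a, b)
transposed_product = transpose(product)                    # transpose of ab
reversed_factors = matmul(transpose(b), transpose(a))      # b~ a~
```

The same check applies when one factor is a single column, as in (11): `transpose(matmul(b, column))` agrees with `matmul(transpose(column), transpose(b))`.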

The conjugate complex matrix. The elements of a matrix may be real or complex. The
matrix whose elements are the conjugate complexes of those of $\mathbf a$ is denoted by $\mathbf a^*$ and
its elements are $a^*_{ik}$.

The transposed matrix of the conjugate complex matrix, $\tilde{\mathbf a}^*$, is denoted by
$\mathbf a^\dagger$ and its elements are

$$a^\dagger_{ik} = a^*_{ki}. \tag{12}$$

Symmetry properties. A matrix is said to be symmetrical if it is unaltered by interchanging
rows and columns, that is,

$$a_{ik} = a_{ki} \tag{13}$$

or

$$\mathbf a = \tilde{\mathbf a}. \tag{14}$$

It is antisymmetrical or skew-symmetrical if the sign is changed when rows and columns
are interchanged, that is,

$$a_{ik} = -a_{ki}, \tag{15}$$

$$\mathbf a = -\tilde{\mathbf a}. \tag{16}$$

A complex matrix is said to be hermitian if it is equal to its transposed conjugate
complex, that is,

$$\mathbf a = \mathbf a^\dagger, \tag{17}$$

and antihermitian if

$$\mathbf a = -\mathbf a^\dagger. \tag{18}$$

For real matrices (17) and (18) reduce to (14) and (16) respectively. If $\mathbf a$ is hermitian,
$i\mathbf a$ is antihermitian.

A diagonal matrix is one all of whose elements are zero except those in the leading
diagonal, i.e., $a_{11}, a_{22}, \dots, a_{nn}$. All pairs of diagonal matrices commute.

The adjugate matrix and the reciprocal matrix. If we consider the determinant formed
by the elements of a square matrix $\mathbf a$

$$\|a_{ik}\| \equiv \det\mathbf a, \tag{19}$$

each element $a_{ik}$ has a cofactor, which we may denote by $A_{ik}$, and

$$a_{ik}A_{jk} = (\det\mathbf a)\,\delta_{ij}. \tag{20}$$

To preserve the rule about summing over adjacent suffixes we rewrite this by forming the
adjugate matrix $\operatorname{adj}\mathbf a$,

$$(\operatorname{adj}\mathbf a)_{ik} = A_{ki}. \tag{21}$$

If $\mathbf a$ is symmetrical or hermitian, so is $\operatorname{adj}\mathbf a$. Further, provided that
$\det\mathbf a$ does not vanish, we have

$$\frac{a_{ij}A_{kj}}{\det\mathbf a} = \delta_{ik} = \frac{A_{ji}a_{jk}}{\det\mathbf a}. \tag{22}$$

If therefore we define the reciprocal $\mathbf a^{-1}$ of a matrix $\mathbf a$ by

$$(\mathbf a^{-1})_{ik} = \frac{A_{ki}}{\det\mathbf a}, \tag{23}$$

we have

$$a_{ij}(\mathbf a^{-1})_{jk} = \delta_{ik} = (\mathbf a^{-1})_{ij}a_{jk}, \tag{24}$$

or

$$\mathbf a\mathbf a^{-1} = \mathbf a^{-1}\mathbf a = \mathbf 1. \tag{25}$$

The product of a matrix with its reciprocal is the unit matrix in whatever order the
multiplication is carried out; provided the reciprocal exists, the order does not matter.
If, however, $\det\mathbf a$ vanishes, $\mathbf a$ has no reciprocal and is said to be singular.
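
The definitions of the adjugate and the reciprocal translate directly into a short computation. The Python sketch below is an illustration, not an efficient method: it expands determinants along the first row, uses exact fractions, and assumes a square matrix of order at least 2; all function names are our own.

```python
from fractions import Fraction


def minor(m, i, k):
    """Matrix left after striking out row i and column k."""
    return [row[:k] + row[k + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant by expansion along the first row (adequate for small n)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** k * m[0][k] * det(minor(m, 0, k)) for k in range(len(m)))


def cofactor(m, i, k):
    return (-1) ** (i + k) * det(minor(m, i, k))


def adjugate(m):
    """(adj a)_ik = A_ki: the array of cofactors, transposed."""
    n = len(m)
    return [[cofactor(m, k, i) for k in range(n)] for i in range(n)]


def reciprocal(m):
    d = det(m)
    if d == 0:
        raise ValueError("singular matrix: no reciprocal")
    return [[Fraction(entry, d) for entry in row] for row in adjugate(m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


a = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
a_inv = reciprocal(a)
unit = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

With this choice of `a` the determinant is 18, and both `matmul(a, a_inv)` and `matmul(a_inv, a)` reproduce the unit matrix exactly.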

Note that we must write $(\mathbf a^{-1})_{ij}$, not $a_{ij}^{-1}$; the latter would mean the reciprocal
of the $ij$ element of $\mathbf a$.

Division. Division by a non-singular matrix may now be defined as multiplication by
its reciprocal, but the quotient depends on the order, as for the product. $\mathbf a^{-1}\mathbf b$ is not
the same as $\mathbf b\mathbf a^{-1}$.

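Reciprocal of a product. Since

$$\mathbf{ab}\,\mathbf b^{-1}\mathbf a^{-1} = \mathbf a\mathbf 1\mathbf a^{-1} = \mathbf 1, \tag{26}$$

it follows that $\mathbf b^{-1}\mathbf a^{-1}$ is the reciprocal of $\mathbf{ab}$, that is,
$(\mathbf{ab})^{-1}$. In forming the reciprocal of a product the order of the factors must be inverted as in
forming the transpose of a product. Written in suffixes (26) takes the form

$$a_{ij}b_{jk}(\operatorname{adj}\mathbf b)_{kl}(\operatorname{adj}\mathbf a)_{lm} = a_{ij}(\det\mathbf b)\,\delta_{jl}(\operatorname{adj}\mathbf a)_{lm} = a_{ij}(\operatorname{adj}\mathbf a)_{jm}(\det\mathbf b) = (\det\mathbf a)(\det\mathbf b)\,\delta_{im},$$

$$(\mathbf{ab})_{ik}(\mathbf b^{-1}\mathbf a^{-1})_{km} = \delta_{im}. \tag{27}$$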

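Unitary and orthogonal matrices. A matrix such that

$$\mathbf a\mathbf a^\dagger = \mathbf a^\dagger\mathbf a = \mathbf 1 \tag{28}$$

is called a unitary matrix. For such a matrix we have

$$\mathbf a^{-1}\mathbf a\mathbf a^\dagger = \mathbf a^{-1}\mathbf 1, \quad\text{whence}\quad \mathbf a^\dagger = \mathbf a^{-1}, \tag{29}$$

and hence for a unitary matrix the reciprocal is the same as the transposed conjugate
complex. A matrix satisfying

$$\tilde{\mathbf a} = \mathbf a^{-1} \tag{30}$$

is said to be orthogonal. A real unitary matrix is evidently orthogonal, since then
$\mathbf a^\dagger = \tilde{\mathbf a}$. If a unitary matrix is also hermitian, then
$\mathbf a^\dagger = \mathbf a$ and

$$\mathbf a = \mathbf a^{-1}, \tag{31}$$

from which it follows that

$$\mathbf{aa} = \mathbf a^2 = \mathbf 1. \tag{32}$$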
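
These definitions can be tested on matrices with small exact entries. The sketch below is illustrative; the three sample matrices are our own assumed examples (they do not appear in the text), and `dagger` is a hypothetical name for the transposed conjugate complex.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def dagger(m):
    """Transposed conjugate complex: element (i, k) is the conjugate of m_ki."""
    return [[m[k][i].conjugate() for k in range(len(m))] for i in range(len(m[0]))]


def is_unitary(m):
    unit = [[1 if i == k else 0 for k in range(len(m))] for i in range(len(m))]
    return matmul(m, dagger(m)) == unit and matmul(dagger(m), m) == unit


u = [[0, 1j], [1j, 0]]              # unitary but not hermitian
h = [[0, -1j], [1j, 0]]             # unitary and hermitian
quarter_turn = [[0, -1], [1, 0]]    # real unitary, hence orthogonal

h_is_hermitian = dagger(h) == h
h_squared = matmul(h, h)            # should be the unit matrix, as in the text
```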

4·02. Solution of linear equations. A set of linear equations

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n &= y_1, \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n &= y_2, \\ \dots\dots\dots\dots\dots\dots\dots\dots\dots & \\ a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n &= y_n, \end{aligned} \tag{1}$$

may be written in the abbreviated form

$$a_{ij}x_j = y_i, \tag{2}$$

where $i$ and $j$ run from 1 to $n$. If we think of $x_i$ as a matrix with a single column (2) may
be written as

$$\mathbf{ax} = \mathbf y, \tag{3}$$

where $\mathbf y$ is also a matrix with a single column. We assume at present that $\mathbf a$ is
non-singular, and denote the cofactor of $a_{ik}$ by $A_{ik}$.

Now since

$$a_{ij}A_{ik} = (\det\mathbf a)\,\delta_{kj}, \tag{4}$$

we can multiply the $i$th equation of (2) by $A_{ik}$ and add, and the sum will be

$$(\det\mathbf a)\,\delta_{kj}x_j = A_{ik}y_i, \tag{5}$$

that is, if $\det\mathbf a \neq 0$,

$$x_k = \frac{A_{ik}y_i}{\det\mathbf a} = (\mathbf a^{-1})_{ki}\,y_i, \tag{6}$$

or

$$\mathbf x = \mathbf a^{-1}\mathbf y, \tag{7}$$

which we might have got from (3) directly by multiplying both sides in front by $\mathbf a^{-1}$.

Any of the sets of equations (5), (6), (7) gives the solution of the equations (1) in a
compact form.*

If the matrix $\mathbf a$ is unitary (6) and (7) become respectively

$$x_k = a^\dagger_{ki}\,y_i, \tag{8}$$

$$\mathbf x = \mathbf a^\dagger\mathbf y. \tag{9}$$

If $\mathbf a$ is orthogonal

$$x_k = a_{ik}\,y_i, \tag{10}$$

$$\mathbf x = \tilde{\mathbf a}\mathbf y. \tag{11}$$
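
Formula (6) can be applied directly to a small system. The sketch below is an illustration with exact fractions; the function names and the sample coefficients are our own, and cofactor expansion is used only because it mirrors the text, not because it is efficient.

```python
from fractions import Fraction


def minor(m, i, k):
    return [row[:k] + row[k + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** k * m[0][k] * det(minor(m, 0, k)) for k in range(len(m)))


def cofactor(m, i, k):
    return (-1) ** (i + k) * det(minor(m, i, k))


def solve(a, y):
    """x_k = A_ik y_i / det a, as in (6); a must be non-singular."""
    d = det(a)
    if d == 0:
        raise ValueError("singular matrix of coefficients")
    n = len(a)
    return [Fraction(sum(cofactor(a, i, k) * y[i] for i in range(n)), d)
            for k in range(n)]


a = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
y = [1, 2, 3]
x = solve(a, y)
check = [sum(a[i][j] * x[j] for j in range(3)) for i in range(3)]  # should equal y

# Orthogonal case (10): x_k = a_ik y_i, no division needed.
quarter_turn = [[0, -1], [1, 0]]
x_orthogonal = [sum(quarter_turn[i][k] * [1, 2][i] for i in range(2)) for k in range(2)]
```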


4·021. Multiplication of determinants. We have assumed that the reader is already
familiar with the rule for multiplying determinants, but the following proof is interesting
because it brings it into direct relation with the existence of solutions of homogeneous
linear equations. Let $a_{ij}$, $b_{ij}$ be typical elements of two determinants. Then $\|a_{ij}\| = 0$
is a necessary and sufficient condition that the equations

$$a_{ij}x_j = 0$$

have a set of solutions $x_j$ different from 0. Multiply by $b_{ik}$ and add; then the equations

$$b_{ik}a_{ij}x_j = 0$$

have a set of solutions different from 0, and therefore if

$$\|a_{ij}\| = 0, \qquad \|b_{ik}a_{ij}\| = 0.$$

Similarly, if $\|b_{ij}\| = 0$,

$$\|a_{ik}b_{ij}\| = 0; \quad\text{and}\quad \|b_{ik}a_{ij}\| = \|a_{ik}b_{ij}\|,$$

since one of these is the transpose of the other. Hence $\|b_{ik}a_{ij}\|$ contains $\|a_{ij}\|$, $\|b_{ij}\|$
as factors; and by comparing coefficients of $a_{11}b_{11}a_{22}b_{22}\dots a_{nn}b_{nn}$ we see that the other
factor is 1.

In matrix notation

$$a_{ij}b_{ik} = \tilde a_{ji}b_{ik} = (\tilde{\mathbf a}\mathbf b)_{jk} \quad\text{and}\quad \det(\tilde{\mathbf a}\mathbf b) = \det\tilde{\mathbf a}\,\det\mathbf b = \det\mathbf a\,\det\mathbf b.$$
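
The multiplication rule is easily confirmed on integer matrices. This Python sketch is illustrative only, with sample matrices and helper names of our own choosing.

```python
def minor(m, i, k):
    return [row[:k] + row[k + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant by expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** k * m[0][k] * det(minor(m, 0, k)) for k in range(len(m)))


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


a = [[1, 2], [3, 4]]        # det a = -2
b = [[0, 1], [5, 6]]        # det b = -5
singular = [[1, 2], [2, 4]]

det_of_product = det(matmul(a, b))
det_of_transposed_product = det(matmul(transpose(a), b))
det_with_singular_factor = det(matmul(singular, b))
```

Here both products have determinant 10, the product of the separate determinants, and a singular factor makes the product singular.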

to” “* -a-i W. give . 
4·03. Transformations. If we have a set of relations expressed by

$$\mathbf y = \mathbf{ax}, \tag{1}$$

where $\mathbf a$ is a non-singular square matrix, we can infer as in the last section that

$$\mathbf x = \mathbf a^{-1}\mathbf y. \tag{2}$$

If the $x_i$ are variables the matrix $\mathbf a$ can be said to effect a transformation of variables.
Two specially important cases are those where $\mathbf a$ is respectively orthogonal and symmetrical.


4·031. Orthogonal transformations. For transformations of rectangular axes we
have had the rules

$$x'_j = l_{ij}x_i \tag{1}$$

and

$$x_i = l_{ij}x'_j. \tag{2}$$

The latter is already in the form of a matrix product, $x_i$ and $x'_j$ being matrices of one
column. It can therefore be written

$$\mathbf x = \mathbf l\mathbf x'. \tag{3}$$

But then we can suppose these equations to be solved to give the $x'_j$ in terms of the $x_i$,
and the solution will be

$$\mathbf x' = \mathbf l^{-1}\mathbf x \tag{4}$$

or

$$x'_j = (\mathbf l^{-1})_{ji}\,x_i. \tag{5}$$

Comparing this with (1) and noticing that they must be equivalent for all values of the
$x_i$ we have

$$(\mathbf l^{-1})_{ji} = l_{ij}, \tag{6}$$

and therefore $\mathbf l$ is an orthogonal matrix; for

$$\mathbf l^{-1} = \tilde{\mathbf l}. \tag{7}$$

(1) can also be written

$$\tilde{\mathbf x}' = \tilde{\mathbf x}\mathbf l. \tag{8}$$

We must write $\tilde{\mathbf x}$, not $\mathbf x$, because a single-column matrix cannot come first; we must
therefore take the transpose of $\mathbf x$ to give a single-row matrix. Thus a rotation of axes can
be expressed as an orthogonal transformation of determinant 1. Now

$$\tilde{\mathbf x}\mathbf x = x_i x_i \tag{9}$$

and

$$x'_j x'_j = \tilde{\mathbf x}'\mathbf x' = \tilde{\mathbf x}\,\mathbf l\,\mathbf l^{-1}\mathbf x = \tilde{\mathbf x}\mathbf x = x_i x_i. \tag{10}$$

This verifies that the form $x_i x_i$ is invariant under an orthogonal transformation.

Conversely, if

$$x_i = l_{ij}x'_j \tag{11}$$

and

$$x_i x_i = x'_j x'_j \tag{12}$$

for all values of $x'_j$, we have

$$(l_{ij}x'_j)(l_{il}x'_l) = x'_j x'_j \tag{13}$$

and therefore, since $l_{ij}l_{il}$ is a symmetrical matrix,

$$l_{ij}l_{il} = \delta_{jl}; \qquad \tilde{\mathbf l}\mathbf l = \mathbf 1. \tag{14}$$

Hence $\mathbf l$ is an orthogonal matrix.

On the face of it (6) represents $n^2$ relations between $n^2$ unknowns, and we might expect
only a finite number of solutions to exist, and even these might be complex. But whatever
$\mathbf l$ might be, $l_{ij}l_{il}$ is symmetrical in $j$ and $l$, and the sufficient condition for
orthogonality therefore requires only $\tfrac12 n(n+1)$ independent relations. Thus apparently the
components $l_{ij}$ might be made to satisfy $\tfrac12 n(n-1)$ additional conditions, and this number
is actually correct.

Now suppose that we keep the same axes $x_i$ but rotate a rigid body about the origin.
Imagine a set of axes $x'_j$ to have originally coincided with the $x_i$ ones but to be fixed in the
body and therefore to be rotated with it. Then the coordinates of any particle of the body
with respect to the $x'_j$ axes are unaltered. We require its new coordinates $y_i$ with respect
to the $x_i$ axes. If the direction cosines are denoted by $l_{ij}$ as before we have

$$y_i = l_{ij}x'_j = l_{ij}x_j, \qquad \mathbf y = \mathbf{lx}, \tag{15}$$

and similarly any vector $\mathbf u$ on rotation becomes $\mathbf v$, where

$$\mathbf v = \mathbf{lu}. \tag{16}$$

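4·032. $2 \times 2$ orthogonal matrix in terms of one parameter. Now let $\mathbf l$ denote the
orthogonal matrix

$$\mathbf l = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}. \tag{17}$$

Then

$$\mathbf l\tilde{\mathbf l} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}\begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} = \begin{pmatrix} \alpha^2 + \beta^2 & \alpha\gamma + \beta\delta \\ \gamma\alpha + \delta\beta & \gamma^2 + \delta^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{18}$$

Then we can choose $\lambda$ so that

$$\alpha = \cos\lambda, \qquad \beta = \sin\lambda, \tag{19}$$

and then

$$\gamma/\delta = -\beta/\alpha = -\tan\lambda, \tag{20}$$

which, with

$$\gamma^2 + \delta^2 = 1, \tag{21}$$

gives

$$\gamma = -\sin\lambda, \qquad \delta = \cos\lambda, \tag{22}$$

or

$$\gamma = \sin\lambda, \qquad \delta = -\cos\lambda. \tag{23}$$

(22) gives

$$\mathbf l = \begin{pmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{pmatrix}, \tag{24}$$

involving one adjustable constant. $\lambda = 0$ gives $\mathbf l = \mathbf 1$.

(23) gives

$$\mathbf l = \begin{pmatrix} \cos\lambda & \sin\lambda \\ \sin\lambda & -\cos\lambda \end{pmatrix}. \tag{25}$$

$\lambda = 0$ gives

$$\mathbf l = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

If we regard $x_1$, $x_2$ as rectangular coordinates in two dimensions and if we substitute (24)
in (15) we obtain a rotation through $-\lambda$. (23), with $\lambda = 0$, leaves $x_1$ unchanged but
reverses the sign of $x_2$. In general, (25) represents a reflexion.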
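
The two one-parameter families (24) and (25) are distinguished by the sign of the determinant. The following floating-point sketch is an illustration only; the function names and the sample angle are our own.

```python
import math


def rotation(lam):
    """The form (24)."""
    c, s = math.cos(lam), math.sin(lam)
    return [[c, s], [-s, c]]


def reflexion(lam):
    """The form (25)."""
    c, s = math.cos(lam), math.sin(lam)
    return [[c, s], [s, -c]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def apply(m, x):
    """y_i = l_ij x_j, as in 4.031 (15)."""
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]


lam = 0.7
turned = apply(rotation(lam), [1.0, 0.0])
angle_turned = math.atan2(turned[1], turned[0])       # expected to be -lam
twice_reflected = apply(reflexion(lam), apply(reflexion(lam), [0.3, -1.2]))
```

The determinant is +1 for (24) and -1 for (25); applying the reflexion twice restores the original point.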
4·033. General orthogonal transformation. In $n$ dimensions, if we take

$$\mathbf l = \begin{pmatrix} \cos\alpha & \sin\alpha & 0 & 0 & \dots \\ -\sin\alpha & \cos\alpha & 0 & 0 & \dots \\ 0 & 0 & 1 & 0 & \dots \\ 0 & 0 & 0 & 1 & \dots \end{pmatrix}, \qquad \tilde{\mathbf l} = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 & 0 & \dots \\ \sin\alpha & \cos\alpha & 0 & 0 & \dots \\ 0 & 0 & 1 & 0 & \dots \\ 0 & 0 & 0 & 1 & \dots \end{pmatrix},$$

we find at once that $\mathbf l\tilde{\mathbf l} = \mathbf 1$, and $\mathbf l$ is orthogonal. The
transformation can be regarded as a rotation on the plane of the axes of $x_1$ and $x_2$, leaving
coordinates in the other $n-2$ dimensions unaltered. We cannot speak of the rotation as being about any
axis, as we can in three dimensions, but we can continue to say that it is parallel to a plane. Such a
rotation, of arbitrary amount, can be made on any plane including two of the axes; hence
$\tfrac12 n(n-1)$ independent rotations are possible in $n$ dimensions.

The independent components of an antisymmetrical matrix have the same number,
and it can actually be shown that the elements of an orthogonal matrix can always be
expressed in terms of those of an antisymmetrical one, but the proof is rather long.*

* W. L. Ferrar, Algebra, pp. 162-7; or see Ex. 10, p. 169.

4·034. General rotation of a rigid body. In three dimensions the matrix expressing a
rotation through an angle $\alpha$ right-handedly about an axis with direction cosines $n_i$ is, by 3·09,

$$\begin{pmatrix} \cos\alpha + n_1^2(1-\cos\alpha) & n_1n_2(1-\cos\alpha) - n_3\sin\alpha & n_1n_3(1-\cos\alpha) + n_2\sin\alpha \\ n_1n_2(1-\cos\alpha) + n_3\sin\alpha & \cos\alpha + n_2^2(1-\cos\alpha) & n_2n_3(1-\cos\alpha) - n_1\sin\alpha \\ n_3n_1(1-\cos\alpha) - n_2\sin\alpha & n_3n_2(1-\cos\alpha) + n_1\sin\alpha & \cos\alpha + n_3^2(1-\cos\alpha) \end{pmatrix}.$$

Let $O123$ be a set of axes and $x_i$ the coordinates of a point of a body referred to them. Let the
body be rotated into a position specified by Euler’s angles, which we shall denote by $\theta$, $\lambda$,
$\chi$. We require the final position of the particle that was at $x_i$ before rotation. Suppose
the body first rotated right-handedly through $\theta$ about $O2$; particles originally on the axes move
to $1'23'$. The matrix for this rotation is

$$\mathbf l_1 = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}$$

and a particle at $\mathbf x$ moves to $\mathbf y = \mathbf l_1\mathbf x$. Now rotate through $\lambda$ about
$O3$ (not $O3'$). The particles at $1'23'$ move to $1''2''3''$ and the general particle moves to
$\mathbf z$, where

$$\mathbf z = \mathbf l_2\mathbf l_1\mathbf x, \qquad \mathbf l_2 = \begin{pmatrix} \cos\lambda & -\sin\lambda & 0 \\ \sin\lambda & \cos\lambda & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

$$\mathbf l_2\mathbf l_1 = \begin{pmatrix} \cos\lambda\cos\theta & -\sin\lambda & \cos\lambda\sin\theta \\ \sin\lambda\cos\theta & \cos\lambda & \sin\lambda\sin\theta \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}.$$

This important matrix gives the coordinates of a particle whose position with regard to
the axes $O1''2''3''$ is known, and may be regarded as the basis of most of the formulae of
spherical astronomy. For a rigid body, however, the positions of all particles are not
specified without a third angle; $\theta$ and $\lambda$ can be chosen to specify one line of particles in
the body, but those originally on $O1$ need not finish on $O1''$. We must therefore consider
a further rotation $\chi$ about $O3''$. The working out of the matrix for this, followed by its
multiplication into $\mathbf l_2\mathbf l_1$, is not a matter to be undertaken if there is anything else to do.
It can, however, be avoided. The displacement from $O123$ to $O1'''2'''3''$ is a rigid-body
displacement. If the displacements $\lambda$ and $\theta$ were undone, in this order, the plane $3''1'''$
would return to a position still inclined at $\chi$ to the plane $31$. Hence we can allow for $\chi$ by
making a rotation about $O3$ first and then applying the rotation $\mathbf l_2\mathbf l_1$. The matrix for
the former is

$$\begin{pmatrix} \cos\chi & -\sin\chi & 0 \\ \sin\chi & \cos\chi & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

and the complete rotation is given by

$$\begin{pmatrix} \cos\lambda\cos\theta\cos\chi - \sin\lambda\sin\chi & -\cos\lambda\cos\theta\sin\chi - \sin\lambda\cos\chi & \cos\lambda\sin\theta \\ \sin\lambda\cos\theta\cos\chi + \cos\lambda\sin\chi & -\sin\lambda\cos\theta\sin\chi + \cos\lambda\cos\chi & \sin\lambda\sin\theta \\ -\sin\theta\cos\chi & \sin\theta\sin\chi & \cos\theta \end{pmatrix}.$$

Thus, for example, a point originally at $(a, 0, 0)$ will finish at

$$a(\cos\lambda\cos\theta\cos\chi - \sin\lambda\sin\chi,\ \sin\lambda\cos\theta\cos\chi + \cos\lambda\sin\chi,\ -\sin\theta\cos\chi).$$
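
The complete rotation can be checked by multiplying the three elementary matrices numerically and comparing with the closed form. The sketch below is illustrative only; the angles and the function names are our own assumptions, and the comparison is made in floating point.

```python
import math


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def about_axis_2(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def about_axis_3(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def euler_rotation(theta, lam, chi):
    """chi about O3 first, then theta about O2, then lam about O3."""
    return matmul(matmul(about_axis_3(lam), about_axis_2(theta)), about_axis_3(chi))


def closed_form(theta, lam, chi):
    ct, st = math.cos(theta), math.sin(theta)
    cl, sl = math.cos(lam), math.sin(lam)
    cx, sx = math.cos(chi), math.sin(chi)
    return [[cl * ct * cx - sl * sx, -cl * ct * sx - sl * cx, cl * st],
            [sl * ct * cx + cl * sx, -sl * ct * sx + cl * cx, sl * st],
            [-st * cx, st * sx, ct]]


theta, lam, chi = 0.4, 1.1, -0.7   # arbitrary sample angles
product = euler_rotation(theta, lam, chi)
formula = closed_form(theta, lam, chi)
max_difference = max(abs(p - f)
                     for p_row, f_row in zip(product, formula)
                     for p, f in zip(p_row, f_row))

a = 2.0
final_position = [row[0] * a for row in product]   # image of the point (a, 0, 0)
```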

4·04. Symmetric matrices. We have already seen that a $3 \times 3$ real symmetric
tensor $K_{ik}$ has three perpendicular principal axes, whose direction cosines with respect
to the original axes are the $l_{ij}$ already found in 3·08, and


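$$\mathbf x = \mathbf l\mathbf x', \qquad \tilde{\mathbf x} = \tilde{\mathbf x}'\tilde{\mathbf l}. \tag{1}$$

Then

$$K_{ik}x_i x_k = l_{ij}l_{kl}K_{ik}\,x'_j x'_l = K'_{jl}\,x'_j x'_l, \tag{2}$$

where

$$K'_{jl} = l_{ij}l_{kl}K_{ik} = \tilde l_{ji}K_{ik}l_{kl}, \tag{3}$$

$$\mathbf K' = \tilde{\mathbf l}\mathbf K\mathbf l. \tag{4}$$

But comparing with 3·08 (12) we see that if $j$ and $l$ are different $K'_{jl} = 0$; and if they are
the same, equal to 1 say,

$$K'_{11} = l_{i1}K_{ik}l_{k1} = \lambda_1. \tag{5}$$

Thus

$$\mathbf K' = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} \tag{6}$$

and the transformation (3) has reduced $\mathbf K$ to a diagonal form.

This result can be extended. A real symmetrical matrix of any order can be reduced
to diagonal form by an orthogonal transformation of the form (4). A still further extension
is that a hermitian matrix of any order can be reduced to diagonal form by a unitary
transformation of the form $\mathbf K' = \mathbf l^\dagger\mathbf K\mathbf l$.

In considering the general motion of a fluid we have seen that part of the motion in a
small neighbourhood is represented by a symmetrical matrix, and that this part expresses
the changes of distance between particles. Consider then the transformation expressed
by a symmetrical matrix $\mathbf K$; we take

$$y_i = K_{ij}x_j = x_j K_{ji}; \tag{7}$$

$$\mathbf y = \mathbf{Kx}; \qquad \tilde{\mathbf y} = \tilde{\mathbf x}\mathbf K. \tag{8}$$

Then

$$\tilde{\mathbf y}\mathbf y = \tilde{\mathbf x}\mathbf{KK}\mathbf x. \tag{9}$$

For a symmetrical matrix $\mathbf{KK}$ is not in general equal to $\mathbf 1$, and

$$\tilde{\mathbf y}\mathbf y \neq \tilde{\mathbf x}\mathbf x.$$

But if we refer to principal axes of $\mathbf K$ we shall have

$$\mathbf y = \mathbf l\mathbf y', \qquad \mathbf x = \mathbf l\mathbf x', \qquad \tilde{\mathbf y} = \tilde{\mathbf y}'\tilde{\mathbf l}, \qquad \tilde{\mathbf x} = \tilde{\mathbf x}'\tilde{\mathbf l}, \tag{10}$$

$$\mathbf y' = \tilde{\mathbf l}\mathbf y = \tilde{\mathbf l}\mathbf K\mathbf l\mathbf x' = \mathbf K'\mathbf x', \tag{11}$$

and

$$y'_1 = \lambda_1 x'_1, \qquad y'_2 = \lambda_2 x'_2, \qquad y'_3 = \lambda_3 x'_3, \tag{12}$$

since $\mathbf K'$ is in diagonal form. The displacement therefore changes the lengths of three
perpendicular marked lines in the body but not their directions; all lengths parallel to
each of these lines are altered in the same ratio. Such a deformation is called a pure strain.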
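
For a $2 \times 2$ real symmetrical matrix the reduction (4) can be carried out in closed form, which gives a quick numerical illustration of a pure strain. The sketch below is an assumption-laden example rather than the method of the text: it chooses the rotation angle from tan 2φ = 2K₁₂/(K₁₁ − K₂₂), and the function name and sample matrix are our own.

```python
import math


def principal_axes_2x2(K):
    """Return (l, K') with K' = l~ K l diagonal, for a real symmetrical 2 x 2 K."""
    phi = 0.5 * math.atan2(2.0 * K[0][1], K[0][0] - K[1][1])
    c, s = math.cos(phi), math.sin(phi)
    l = [[c, -s], [s, c]]  # columns are the directions of the principal axes
    # K'_jl = l_ij K_ik l_kl, summed over i and k
    K_prime = [[sum(l[i][j] * K[i][k] * l[k][m] for i in range(2) for k in range(2))
                for m in range(2)]
               for j in range(2)]
    return l, K_prime


K = [[2.0, 1.0], [1.0, 2.0]]
l, K_prime = principal_axes_2x2(K)
```

For this `K` the principal axes lie at 45 degrees to the original ones and the diagonal elements of `K_prime` are 3 and 1, so lengths along the two principal directions are altered in those ratios.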

We shall carry through the general reduction of an $n \times n$ hermitian matrix to the
diagonal form, of which the reduction of a symmetrical matrix is a particular case. This
problem occurs in many branches of mathematical physics. Before considering it, however,
we must consider the solution of a set of homogeneous linear equations and some
properties of determinants.

4·05. Rank of a matrix. Homogeneous linear equations. In general a set of $n$
linear equations in $n$ unknowns, with $n$ constants on the right, has a unique solution.
But if we take such a pair as

$$x + y = 1, \qquad 2x + 2y = 3,$$

there is no solution. This fact is associated with a property of the coefficients; for whereas
the pair

$$x + y = 0, \qquad x - y = 0$$

has no solution other than $x = y = 0$, the pair

$$x + y = 0, \qquad 2x + 2y = 0$$

has an infinite number of solutions, namely, any pair such that $y = -x$. In the former
case the matrix of the coefficients is non-singular, in the latter it is singular. This is a
general rule: if the constants on the right are all zero, a necessary and sufficient condition
for the existence of solutions different from 0 is that the determinant of the coefficients
is zero, and if the constants are not all zero and the determinant of the coefficients
is zero, then either the equations are inconsistent and have no solution, or they are
consistent and have an infinite number.

It is possible, moreover, for $n$ equations to have a doubly infinite set of solutions, or
indeed a set in which any number less than $n$ of the unknowns can be assigned arbitrarily.
To see how this arises it is convenient to introduce the idea of the rank of a matrix or a
determinant. Consider the set of $n$ equations

$$\mathbf{ax} = 0, \qquad a_{ik}x_k = 0. \tag{1}$$

There is clearly always one solution $x_k = 0$ $(k = 1, 2, \dots, n)$.

A determinant obtained from $\|a_{ik}\|$ by suppressing $m$ rows and $m$ columns is called
a minor of order $n - m$; $\|a_{ik}\|$ itself may be called the minor of order $n$, and any single
element is a minor of order 1. If $m$ rows and columns are suppressed in such a way that
the values of $i$ for the rows are the same as those of $k$ for the columns, the minor is a
principal minor; a principal minor is symmetrically placed about the leading diagonal.
The principal minor in the top left-hand corner is called a leading minor. Thus in the
determinant of order 4

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{vmatrix}$$

the minor enclosed by dotted lines is simply a minor of order 2. That enclosed by broken
lines is a principal minor, and that enclosed by continuous lines is the leading minor
of order 2.

Now it may happen that all the minors of order $r + 1$ vanish (and therefore all those of
order greater than $r + 1$ do) while some of those of order $r$ do not. The matrix and the
determinant are then said to be of rank $r$. $r$ is the largest integer for which it can be said
‘not all minors of order $r$ are zero’. In particular, a determinant of order $n$ that vanishes,
while not all its first minors (minors of order $n - 1$) vanish, is of rank $n - 1$. If the
determinant $\|a_{ik}\|$ itself does not vanish, the matrix and the determinant are of rank $n$.*
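
The definition of rank by minors can be followed literally for small square matrices. The sketch below is an illustration only (it examines every minor and so is practical only for small orders); the function names and sample matrices are our own.

```python
from itertools import combinations


def det(m):
    """Determinant by expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** k * m[0][k]
               * det([row[:k] + row[k + 1:] for row in m[1:]])
               for k in range(len(m)))


def rank(m):
    """Largest r for which not all minors of order r are zero (square m)."""
    n = len(m)
    for order in range(n, 0, -1):
        for rows in combinations(range(n), order):
            for cols in combinations(range(n), order):
                if det([[m[i][k] for k in cols] for i in rows]) != 0:
                    return order
    return 0


non_singular_pair = [[1, 1], [1, -1]]     # x + y = 0, x - y = 0
singular_pair = [[1, 1], [2, 2]]          # x + y = 0, 2x + 2y = 0
rank_two_of_three = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
rank_one_of_three = [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
```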

* It is possible to extend the notion of rank to $m \times n$ matrices; for instance,

$$x + y + z = 0, \qquad 2x + 2y + 2z = 1,$$

are inconsistent for all finite $x$, $y$, $z$. We need not consider such cases, which have little physical
interest and greatly complicate the analysis.

We can see at once that unless $\mathbf a$ is of rank less than $n$ the equations (1) have no solution
other than $\mathbf x = 0$. For if $\mathbf a$ is of rank $n$ it has a reciprocal $\mathbf a^{-1}$, and

$$\mathbf a^{-1}\mathbf{ax} = 0 \tag{2}$$

implies

$$\mathbf x = 0. \tag{3}$$

Conversely, suppose that $\mathbf a$ is of rank $n - 1$. This means that $\|a_{ik}\| = 0$, but not all the
cofactors $A_{ik}$ are 0. For definiteness suppose that $A_{nn}$, the leading first minor, is not zero.
This can always be attained by rearranging the equations and renumbering the unknowns.
Then a solution of equations (1) can be obtained by putting $x_n = b$, an arbitrary
constant. For we can solve the first $n - 1$ equations for $x_1, x_2, \dots, x_{n-1}$ in the usual way,
since the determinant of the coefficients of these unknowns is $A_{nn}$, which by hypothesis
is not zero. The solution is

$$x_i = \frac{bA_{ni}}{A_{nn}} \qquad (i = 1, 2, \dots, n). \tag{4}$$

We have to verify that this solution also satisfies the $n$th equation. Substituting, we have

$$a_{nk}x_k = \frac{b}{A_{nn}}\sum_k a_{nk}A_{nk} = \frac{b\,\|a_{ik}\|}{A_{nn}}, \tag{5}$$

since the sum is the expansion of $\|a_{ik}\|$ in terms of elements of the last row. But this
is zero because $\|a_{ik}\| = 0$, $A_{nn} \neq 0$. Hence the $n$th equation is also satisfied. Thus the
ratios of the $x_i$ are unique, but the $x_i$ can all be multiplied by an arbitrary factor.
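
Formula (4) can be exercised on a $3 \times 3$ matrix of rank 2. The sketch below is illustrative, using exact fractions; the function names and the sample matrix are our own, and it assumes, as in the text, that the cofactor $A_{nn}$ does not vanish.

```python
from fractions import Fraction


def minor(m, i, k):
    return [row[:k] + row[k + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** k * m[0][k] * det(minor(m, 0, k)) for k in range(len(m)))


def cofactor(m, i, k):
    return (-1) ** (i + k) * det(minor(m, i, k))


def null_solution(a, b=1):
    """x_k = b A_nk / A_nn for a singular a whose cofactor A_nn is not zero."""
    n = len(a)
    if det(a) != 0:
        raise ValueError("a is non-singular: the only solution is x = 0")
    leading = cofactor(a, n - 1, n - 1)
    if leading == 0:
        raise ValueError("A_nn vanishes: rearrange equations or unknowns")
    return [Fraction(b * cofactor(a, n - 1, k), leading) for k in range(n)]


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
x = null_solution(a)
residuals = [sum(a[i][k] * x[k] for k in range(3)) for i in range(3)]
```

Here the solution is (1, -2, 1), every equation is satisfied, and a different choice of `b` merely multiplies the solution by a constant factor.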

If the cofactor $A_{pq}$ obtained by striking out the $p$th row and $q$th column is not zero
we could similarly, putting $x_q = c$, obtain the solution

$$x_k = \frac{cA_{pk}}{A_{pq}}. \tag{6}$$

But for (4) and (6) to be consistent, for all $k$, $p$, $q$,

$$A_{pk}A_{nq} - A_{nk}A_{pq} = 0. \tag{7}$$

The vanishing of this expression when $\|a_{ik}\|$ vanishes is a special case of Jacobi’s theorem,
which we shall prove later. It follows further that if a determinant and one first minor are
zero, the minors either of all elements in the same row or of all in the same column are zero.

Suppose now that the matrix $\mathbf a$ is of rank $r$. For definiteness suppose the equations
arranged so that the leading minor of order $r$ is not zero. We must suspend the summation
convention for summations that are not from 1 to $n$. Then the equations can be written

$$\sum_{k=1}^{r} a_{ik}x_k = -\sum_{\kappa=r+1}^{n} a_{i\kappa}x_\kappa \qquad (i = 1, 2, \dots, r), \tag{8}$$

$$\sum_{k=1}^{n} a_{ik}x_k = 0 \qquad (i = r+1, \dots, n). \tag{9}$$

For any set of values of $x_\kappa$ ($\kappa = r + 1$ to $n$) the first set has a unique solution. If in fact we
denote the determinant of the coefficients on the left of (8) by $\alpha$, it is non-singular and has
a reciprocal; and we denote the cofactor of $a_{ik}$ in it by $\alpha_{ik}$. Then for $k = 1$ to $r$, (8) are
satisfied if, and only if,

$$x_k = -\frac{1}{\alpha}\sum_{p=1}^{r}\sum_{\kappa=r+1}^{n}\alpha_{pk}\,a_{p\kappa}\,x_\kappa. \tag{10}$$

Substitute in any of (9). We get for $i > r$

$$\sum_{k=1}^{n} a_{ik}x_k = -\sum_{k=1}^{r}\sum_{p=1}^{r}\frac{a_{ik}\alpha_{pk}}{\alpha}\sum_{\kappa=r+1}^{n}a_{p\kappa}x_\kappa + \sum_{\kappa=r+1}^{n}a_{i\kappa}x_\kappa = \sum_{\kappa=r+1}^{n}\frac{x_\kappa}{\alpha}\Bigl(\alpha\,a_{i\kappa} - \sum_{p=1}^{r}\sum_{k=1}^{r}a_{ik}\alpha_{pk}a_{p\kappa}\Bigr). \tag{11}$$

Consider the following minor of order $r + 1$, and expand in terms of the last row and
column:

$$\begin{vmatrix} a_{11} & a_{12} & \dots & a_{1r} & a_{1\kappa} \\ a_{21} & \dots & \dots & \dots & a_{2\kappa} \\ \dots & \dots & \dots & \dots & \dots \\ a_{r1} & a_{r2} & \dots & a_{rr} & a_{r\kappa} \\ a_{i1} & \dots & \dots & a_{ir} & a_{i\kappa} \end{vmatrix}. \tag{12}$$

The coefficient of $a_{i\kappa}$ in its expansion is $\alpha$. That of $a_{ik}a_{p\kappa}$ is that of $a_{i\kappa}a_{pk}$
with the sign reversed, and this is $-\alpha_{pk}$. Hence the coefficient of $x_\kappa/\alpha$ in (11) is a minor
of order $r + 1$ of the original determinant, which vanishes by hypothesis. Thus if the matrix $\mathbf a$ is
of rank $r$ less than $n$, $n - r$ of the unknowns can be assigned arbitrarily and the remainder are
homogeneous linear functions of them.


4·06. Determinantal equations. We often have to consider a set of $n$ equations

$$\lambda a_{ik}x_k = b_{ik}x_k, \tag{1}$$

where $\lambda$ must be determined in such a way that solutions exist with some of the $x_k$
different from 0. This requires that $\lambda$ must be equal to some root of the equation

$$\|\lambda a_{ik} - b_{ik}\| = 0. \tag{2}$$

This will in general have $n$ distinct roots. For each root $\lambda_j$ a set of values of the $x_i$, say
$l_{ij}$, can be found to satisfy (1), and any multiple of this set will be a solution. For some
common types of matrices, notably symmetrical and unitary ones, the solution has
special properties, which we shall study later.

If $a_{ik} = \delta_{ik}$, the left of (1) reduces to $\lambda x_i$. Then (1) can be written in matrix form

$$\lambda\mathbf x = \mathbf{bx}, \tag{3}$$

and the $\lambda$ are variously called the latent roots, characteristic values, proper values, and
eigenvalues of the matrix $\mathbf b$.* The equation determining them

$$\|b_{ik} - \lambda\delta_{ik}\| = 0, \tag{4}$$

is called the characteristic equation of the matrix. A set of $x_i$, say $l_{ij}$, satisfying (3) for a
particular value of $\lambda$, say $\lambda_j$, may be called a characteristic solution or eigensolution.
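
For a $2 \times 2$ matrix the characteristic equation (4) is the quadratic $\lambda^2 - (b_{11} + b_{22})\lambda + \det\mathbf b = 0$, so the latent roots and characteristic solutions can be written down directly. The sketch below is illustrative; the function names are our own, and `characteristic_solution` assumes $b_{12} \neq 0$.

```python
import cmath


def latent_roots_2x2(b):
    """Roots of lambda^2 - (trace) lambda + (determinant) = 0."""
    trace = b[0][0] + b[1][1]
    determinant = b[0][0] * b[1][1] - b[0][1] * b[1][0]
    root = cmath.sqrt(trace * trace - 4 * determinant)
    return ((trace + root) / 2, (trace - root) / 2)


def characteristic_solution(b, lam):
    """A solution of b x = lam x, valid when b_12 is not zero."""
    return (b[0][1], lam - b[0][0])


def residual(b, lam, x):
    """Components of b x - lam x; both vanish for a characteristic solution."""
    return [b[i][0] * x[0] + b[i][1] * x[1] - lam * x[i] for i in range(2)]


b = [[2, 1], [1, 2]]
roots = latent_roots_2x2(b)
solutions = [characteristic_solution(b, lam) for lam in roots]
```

For this symmetrical `b` the roots are 3 and 1 with characteristic solutions proportional to (1, 1) and (1, -1). A real matrix need not have real roots: the quarter-turn matrix `[[0, -1], [1, 0]]` gives the roots i and -i.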

If we take a general set of numbers X, it may always be possible to find a set so that 

= (5) 

The condition for this is that the matrix l i} shall be non-singular. This is easily proved to 
be satisfied when all the A ; - are different. For if l tj was singular we could take non-zero 
values of £,• so that 

S hj£j = 0 (6) 

for all k. ' 


* Both A and A 1 sometimes occur in actual problems. Hilbert and Courant (Methoden d. Math 
Phys. 1, 13) use characteristic value for A defined by (3) and eigenvalue for 1/A. 
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Let r be the smallest number of the ^ represented in any such set of equations. Then 
we can arrange so that j runs from 1 to r. Then for all i 

0 = 2 (?) 
j=*i 

But if we take l to be any j represented in (6) the corresponding factor A^ — A z = 0, and we 
have a set of relations for all i between at most r — 1 of the since by hypothesis the A^ 
are all different and the terms in (7) cannot all vanish. Thus the hypothesis that ly is 
singular leads to a contradiction. 

$l_{ij}$ is also often non-singular, but not always, when some of the roots are equal. Consider
the matrices

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

Each has a repeated root 1. For the first, two characteristic solutions are (1, 0), (0, 1).
The only characteristic solution of the second is (1, 0).

Consider the matrix $\lambda_{jl}$ whose diagonal elements are the $\lambda_j$ and all others zero. Then
we can write (3) as

$$b_{ik} l_{kj} = l_{ik} \lambda_{kj} \tag{8}$$

for all i, j; and, if $\mathbf{l}$ is non-singular, multiplying before with $\mathbf{l}^{-1}$ we have

$$\mathbf{l}^{-1}\mathbf{b}\mathbf{l} = \boldsymbol{\lambda}. \tag{9}$$

Thus a transformation of the form (9) reduces $\mathbf{b}$ to diagonal form. Applied to the unit
matrix it gives $\mathbf{l}^{-1}\mathbf{1}\mathbf{l}$, which is $\mathbf{1}$. Hence such a transformation leaves the unit matrix
unaltered: it is called a collineatory transformation and two matrices connected by such
a transformation are said to be similar. It is not necessary to specify that $\mathbf{l}$ is non-singular;
this is taken as understood from the fact that $\mathbf{l}^{-1}$ appears in the formula; if $\mathbf{l}$ was singular
there would be no such formula.

4·061. Importance of diagonal matrices. All diagonal matrices commute with
one another. If the diagonal elements of a diagonal matrix $\boldsymbol{\lambda}$ are $\lambda_r$, those of $\boldsymbol{\lambda}^n$ are $\lambda_r^n$
and all others zero. If no $\lambda_r$ is zero the diagonal elements of $\boldsymbol{\lambda}^{-1}$ are $\lambda_r^{-1}$ and all others are
zero. If $\mathbf{l}^{-1}\mathbf{a}\mathbf{l} = \boldsymbol{\lambda}$, then $\mathbf{a}^n = (\mathbf{l}\boldsymbol{\lambda}\mathbf{l}^{-1})^n = \mathbf{l}\boldsymbol{\lambda}^n\mathbf{l}^{-1}$, all intermediate $\mathbf{l}^{-1}\mathbf{l}$ cancelling when the
expression is written out in full.

This result is connected directly with the behaviour of dynamical systems satisfying
linear equations of motion. If the coordinates and velocities are given at time 0, those
at time t are linear functions of them. If then we denote their values at times 0 and $\tau$
by $X_i$ and $Y_i$ (i = 1 to 2n, where n is the number of degrees of freedom), then

$$Y_i = a_{ik} X_k, \qquad \mathbf{Y} = \mathbf{a}\mathbf{X}.$$

Now if 4·06 (5) is satisfied by characteristic solutions of the matrix equations $\mathbf{a}\mathbf{x} = \lambda\mathbf{x}$,
we can write

$$X_k = l_{kj}\xi_j, \qquad \mathbf{X} = \mathbf{l}\boldsymbol{\xi}; \qquad \boldsymbol{\xi} = \mathbf{l}^{-1}\mathbf{X}$$

and

$$Y_i = a_{ik} l_{kj} \xi_j = l_{ik} \lambda_{kj} \xi_j, \qquad \mathbf{Y} = \mathbf{l}\boldsymbol{\lambda}\boldsymbol{\xi} = \mathbf{l}\boldsymbol{\lambda}\mathbf{l}^{-1}\mathbf{X}.$$

If now $Z_i$ are the values at time $2\tau$ they are found from $\mathbf{Z} = \mathbf{a}\mathbf{Y} = \mathbf{a}^2\mathbf{X} = \mathbf{l}\boldsymbol{\lambda}^2\mathbf{l}^{-1}\mathbf{X}$. The
process can be repeated to any multiple of $\tau$, and therefore so long as $\mathbf{l}$ is not singular the
solution is found. Evidently the $\lambda_{jl}$ (j = l) are the time factors $\exp(\gamma\tau)$ in the solution
of the dynamical equations by the usual method.

Since in general the roots are different $\mathbf{l}$ is usually not singular; in some important
special cases $\mathbf{l}$ is not singular even if some of the roots are equal, and the same method
can still be used.
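
The relation $\mathbf{a}^n = \mathbf{l}\boldsymbol{\lambda}^n\mathbf{l}^{-1}$ can be checked on a small case. The following is a minimal sketch in plain Python and is not part of the original text; the 2 × 2 matrix, the helper names `matmul` and `power_by_diagonalisation`, and the use of exact fractions are illustrative assumptions.

```python
from fractions import Fraction

def matmul(p, q):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(p[i][j] * q[j][k] for j in range(len(q)))
             for k in range(len(q[0]))] for i in range(len(p))]

def power_by_diagonalisation(l, l_inv, roots, n):
    """Return l * diag(roots)**n * l_inv, i.e. the n-th power of l*diag(roots)*l_inv."""
    size = len(roots)
    lam_n = [[roots[i] ** n if i == j else 0 for j in range(size)]
             for i in range(size)]
    return matmul(matmul(l, lam_n), l_inv)

# Illustrative matrix: roots 1 and 3, characteristic solutions (1, -1) and (1, 1).
a = [[2, 1], [1, 2]]
l = [[1, 1], [-1, 1]]                    # columns are the characteristic solutions
half = Fraction(1, 2)
l_inv = [[half, -half], [half, half]]    # inverse of l
roots = [1, 3]

a_cubed_direct = matmul(a, matmul(a, a))
a_cubed_diag = power_by_diagonalisation(l, l_inv, roots, 3)
print(a_cubed_direct == a_cubed_diag)    # True
```

Only the diagonal elements are raised to the power n; the same `l` and `l_inv` serve for every multiple of the time step.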

4·062. A matrix satisfies its own characteristic equation. If the equation

$$\| a_{ik} - \lambda \delta_{ik} \| = 0 \tag{1}$$

is expanded in powers of $\lambda$, we get an equation of degree n in $\lambda$:

$$D(\lambda) = \alpha_n + \alpha_{n-1}\lambda + \alpha_{n-2}\lambda^2 + \dots + (-1)^n \lambda^n = 0. \tag{2}$$

The theorem means that if we substitute the matrix $\mathbf{a}$ for $\lambda$ in each term and interpret by the rules of
matrix multiplication and addition, we get a matrix $D(\mathbf{a})$ every element of which is 0.

We assume first that the equation (2) has n distinct roots. For each root $\lambda_j$ a set of non-zero $x_{ij}$
exists satisfying

$$\lambda_j x_{ij} = a_{ik} x_{kj}, \qquad \mathbf{a}\mathbf{x}_j = \mathbf{x}_j \lambda_j, \tag{3}$$

$$\mathbf{a}^r \mathbf{x}_j = \mathbf{x}_j \lambda_j^r, \tag{4}$$

since $\lambda_j$ is a number. Then take

$$y_i = x_{ij}\beta_j, \qquad \mathbf{y} = \sum_j \beta_j \mathbf{x}_j, \tag{5}$$

where $\beta_j$ are n arbitrary multipliers. Then

$$
\begin{aligned}
D(\mathbf{a})\,\mathbf{y} &= \left(\alpha_n \mathbf{1} + \alpha_{n-1}\mathbf{a} + \alpha_{n-2}\mathbf{a}^2 + \dots + (-1)^n \mathbf{a}^n\right) \sum_j \beta_j \mathbf{x}_j \\
&= \sum_j \left(\alpha_n + \alpha_{n-1}\lambda_j + \alpha_{n-2}\lambda_j^2 + \dots + (-1)^n \lambda_j^n\right) \beta_j \mathbf{x}_j \\
&= 0,
\end{aligned}
\tag{6}
$$

since the $\lambda_j$ all satisfy (2). But the matrix of $x_{ij}$ is non-singular, and the $\beta_j$ can be chosen to make $y_i$
anything we like; and therefore

$$D(\mathbf{a}) = 0. \tag{7}$$

If $D(\lambda) = 0$ has multiple roots we can suppose the elements altered so as to make the roots distinct,
and then apply (7). Then if the elements are made to approach their original values continuously
each element of $D(\mathbf{a})$ also approaches its original value continuously. But it is always 0; hence the
original value of each element must also be 0.

Shorter proofs exist; this one is given to illustrate the use of diagonal matrices. 
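
The theorem can be verified directly for a particular matrix. The sketch below is an illustration only and is not the proof given above: it obtains the coefficients of $\det(\lambda\mathbf{1} - \mathbf{a})$ by the Faddeev-LeVerrier recurrence (a technique brought in here for convenience; this polynomial differs from $D(\lambda)$ in the text only by the factor $(-1)^n$) and then substitutes the matrix for $\lambda$. The function names and the 3 × 3 test matrix are assumptions.

```python
from fractions import Fraction

def matmul(p, q):
    """Product of two square matrices stored as lists of rows."""
    n = len(p)
    return [[sum(p[i][j] * q[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

def char_poly(a):
    """Coefficients c[0..n] with det(lambda*1 - a) = sum_k c[k] * lambda**k."""
    n = len(a)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        am = matmul(a, m)
        # m_k = a * m_{k-1} + c[n-k+1] * 1
        m = [[am[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        am = matmul(a, m)
        c[n - k] = -sum(am[i][i] for i in range(n)) / k
    return c

def poly_at_matrix(c, a):
    """Evaluate sum_k c[k] * a**k by Horner's rule, with a**0 the unit matrix."""
    n = len(a)
    result = [[Fraction(0)] * n for _ in range(n)]
    for coeff in reversed(c):
        result = matmul(result, a)
        result = [[result[i][j] + (coeff if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result

a = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
coeffs = char_poly(a)            # lambda^3 - 9 lambda^2 + 24 lambda - 18
d_of_a = poly_at_matrix(coeffs, a)
print(all(x == 0 for row in d_of_a for x in row))   # True
```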


4·063. Block matrices. Suppose that the elements of a matrix are grouped into blocks
(squares and rectangles). These blocks themselves can be regarded as elements of a matrix 
of lower order. Thus from the matrix 


where 



(1) 

( 2 ) 


We shall speak of the matrix on the right in (1) as in block form and denote it by $\mathbf{A}$. That on
the left will be said to be in expanded form. Two matrices $\mathbf{A}$, $\mathbf{B}$ in block form may be
multiplied to give another matrix $\mathbf{AB} = \mathbf{C}$ in block form by the usual rule,

$$C_{ik} = \sum_j A_{ij} B_{jk}, \tag{3}$$

provided that (1) the number of block columns of $\mathbf{A}$ is equal to the number of block rows
of $\mathbf{B}$, (2) the number of columns of the matrix $A_{ij}$ is equal to the number of rows of the
matrix $B_{jk}$. For instance, if $\mathbf{A}$, $\mathbf{B}$ are built up as follows, the marginal entries indicating
the numbers of rows and columns, m, n, p of the components,


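$$
\mathbf{A} =
\begin{array}{c|cc}
 & n_1 & n_2 \\ \hline
m_1 & A_{11} & A_{12} \\
m_2 & A_{21} & A_{22}
\end{array},
\qquad
\mathbf{B} =
\begin{array}{c|ccc}
 & p_1 & p_2 & p_3 \\ \hline
n_1 & B_{11} & B_{12} & B_{13} \\
n_2 & B_{21} & B_{22} & B_{23}
\end{array},
\tag{4}
$$

where $A_{11}$, $A_{12}$, $A_{21}$, $A_{22}$ are respectively $m_1 \times n_1$, $m_1 \times n_2$, $m_2 \times n_1$, $m_2 \times n_2$ and $B_{11}$, $B_{12}$, ...
are $n_1 \times p_1$, $n_1 \times p_2$, ..., the product matrix is

$$
\mathbf{C} =
\begin{array}{c|ccc}
 & p_1 & p_2 & p_3 \\ \hline
m_1 & C_{11} & C_{12} & C_{13} \\
m_2 & C_{21} & C_{22} & C_{23}
\end{array},
\tag{5}
$$

where $C_{11}$, $C_{12}$, ... are $m_1 \times p_1$, $m_1 \times p_2$, ....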

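If $\mathbf{a}$, $\mathbf{b}$ are simple matrices and $\mathbf{ab} = \mathbf{c}$, then if $\mathbf{A}$, $\mathbf{B}$ are block forms of $\mathbf{a}$, $\mathbf{b}$, $\mathbf{C}$ is a block
form of $\mathbf{c}$. In other words, if we form $\mathbf{C}$ we can derive $\mathbf{c}$ by writing out all the elements of
the matrix elements of $\mathbf{C}$ in full and removing internal brackets. We prove this for the case
where $A_{11}$, $B_{11}$ are $n_1 \times n_1$, $A_{12}$, $B_{12}$ are $n_1 \times n_2$, $A_{21}$, $B_{21}$ are $n_2 \times n_1$, and $A_{22}$, $B_{22}$ are $n_2 \times n_2$.
It is convenient in $A_{12}$, $A_{22}$ to number the columns $n_1 + 1$ to $n_1 + n_2$, and in $A_{21}$, $A_{22}$ to
number the rows $n_1 + 1$ to $n_1 + n_2$. Then

$$
\left.
\begin{aligned}
a_{ik} &= (A_{11})_{ik} && (i \le n_1,\ k \le n_1), \\
a_{ik} &= (A_{12})_{ik} && (i \le n_1,\ n_1 + 1 \le k \le n_1 + n_2), \\
a_{ik} &= (A_{21})_{ik} && (n_1 + 1 \le i \le n_1 + n_2,\ k \le n_1), \\
a_{ik} &= (A_{22})_{ik} && (n_1 + 1 \le i \le n_1 + n_2,\ n_1 + 1 \le k \le n_1 + n_2).
\end{aligned}
\right\}
\tag{6}
$$

The same rules are applied to $\mathbf{b}$. Now

$$c_{ik} = (\mathbf{ab})_{ik} = \left( \sum_{j=1}^{n_1} + \sum_{j=n_1+1}^{n_1+n_2} \right) a_{ij} b_{jk}. \tag{7}$$

For $i, k \le n_1$, this is

$$\sum_{j=1}^{n_1} (A_{11})_{ij} (B_{11})_{jk} + \sum_{j=n_1+1}^{n_1+n_2} (A_{12})_{ij} (B_{21})_{jk} = (A_{11}B_{11})_{ik} + (A_{12}B_{21})_{ik} = (C_{11})_{ik}.$$

The proof is similar for the other parts of the product.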
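
The statement that the block product, written out in full, reproduces the ordinary product can be tried on a small case. The sketch below is illustrative only; the helper names (`split`, `join`, `block_product`) and the 3 × 3 matrices are assumptions, and the partition is taken after row and column `n1` for both factors, as in the proof above.

```python
def matmul(p, q):
    """Product of two (possibly rectangular) matrices stored as lists of rows."""
    return [[sum(p[i][j] * q[j][k] for j in range(len(q)))
             for k in range(len(q[0]))] for i in range(len(p))]

def matadd(p, q):
    """Element-by-element sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(p, q)]

def split(a, n1):
    """Partition a square matrix after row and column n1 into four blocks."""
    a11 = [row[:n1] for row in a[:n1]]
    a12 = [row[n1:] for row in a[:n1]]
    a21 = [row[:n1] for row in a[n1:]]
    a22 = [row[n1:] for row in a[n1:]]
    return a11, a12, a21, a22

def join(c11, c12, c21, c22):
    """Write the blocks out in full, removing the internal brackets."""
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom

def block_product(a, b, n1):
    """Multiply a and b block by block using C_ik = sum_j A_ij B_jk."""
    a11, a12, a21, a22 = split(a, n1)
    b11, b12, b21, b22 = split(b, n1)
    c11 = matadd(matmul(a11, b11), matmul(a12, b21))
    c12 = matadd(matmul(a11, b12), matmul(a12, b22))
    c21 = matadd(matmul(a21, b11), matmul(a22, b21))
    c22 = matadd(matmul(a21, b12), matmul(a22, b22))
    return join(c11, c12, c21, c22)

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[1, 0, 2], [0, 1, 1], [3, 1, 0]]
print(block_product(a, b, 1) == matmul(a, b))   # True
```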

We shall speak of a matrix as in diagonal block form if it is in block form and all the 
matrix elements other than the diagonal ones are null. The expanded matrix found by 
writing out in full and removing brackets will then have the property that all its non-zero
elements are in non-overlapping squares symmetrical about the leading diagonal. A 
diagonal matrix is a special case. 

If A , B are in diagonal block form it is clear that the product AB is also in diagonal 
block form. 


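4·064. If $\mathbf{A}$ is a diagonal block matrix whose diagonal elements are $A_{rs}$ (s = r) and its
expanded matrix is $\mathbf{a}$, a necessary and sufficient condition that $\mathbf{a}$ shall be reducible to diagonal
form by a collineatory transformation is that all the $A_{rs}$ shall be so reducible; and the
characteristic solutions of $\mathbf{a}$ are obtained from those of the $A_{rs}$ by including zero components.

We take the case where $\mathbf{A}$ is 2 × 2, since the extension to the case where it is n × n
follows by repetition. The statement that $A_{11}$ is reducible to diagonal form means that
a non-singular $L_1$ exists such that $L_1^{-1} A_{11} L_1 = \boldsymbol{\lambda}_1$, where $\boldsymbol{\lambda}_1$ is diagonal. Then if $A_{22}$ is
similarly reducible to $\boldsymbol{\lambda}_2$ by $L_2$,

$$
\begin{pmatrix} L_1^{-1} & 0 \\ 0 & L_2^{-1} \end{pmatrix}
\begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} \end{pmatrix}
\begin{pmatrix} L_1 & 0 \\ 0 & L_2 \end{pmatrix}
=
\begin{pmatrix} L_1^{-1} A_{11} L_1 & 0 \\ 0 & L_2^{-1} A_{22} L_2 \end{pmatrix}
=
\begin{pmatrix} \boldsymbol{\lambda}_1 & 0 \\ 0 & \boldsymbol{\lambda}_2 \end{pmatrix}.
\tag{1}
$$

The expanded matrix formed from $\begin{pmatrix} L_1 & 0 \\ 0 & L_2 \end{pmatrix}$ is non-singular because $L_1$, $L_2$ are
non-singular, and

$$
\begin{pmatrix} L_1 & 0 \\ 0 & L_2 \end{pmatrix}^{-1}
=
\begin{pmatrix} L_1^{-1} & 0 \\ 0 & L_2^{-1} \end{pmatrix}.
\tag{2}
$$

Hence the condition is sufficient.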

Conversely, let $\mathbf{a}$ be reducible to diagonal form by a collineatory transformation. Let
$A_{11}$, $A_{22}$ be $n_1 \times n_1$ and $n_2 \times n_2$ respectively. Then the statement that $\mathbf{a}$ is reducible to
diagonal form by a transformation $\mathbf{l}^{-1}\mathbf{a}\mathbf{l}$ means that $n_1 + n_2$ linearly independent sets
$l_{i(k)}$ (k = 1 to $n_1 + n_2$) exist such that for each set there is a $\lambda_k$ satisfying

$$a_{ij}\, l_{j(k)} = \lambda_k\, l_{i(k)}, \tag{3}$$

where the brackets round the suffix mean that we are not summing with regard to k.
The $\lambda_k$ need not all be different, but the statement that $\mathbf{l}^{-1}\mathbf{a}\mathbf{l}$ is diagonal implies that even
if some of the $\lambda_k$ are equal corresponding solutions can still be chosen so as to be linearly
independent.

We note first that

$$\det(\mathbf{a} - \lambda\mathbf{1}) = \det(A_{11} - \lambda\mathbf{1}) \det(A_{22} - \lambda\mathbf{1}). \tag{4}$$

Therefore every root of $\mathbf{a}$ is one of either $A_{11}$ or $A_{22}$; and if $\lambda$ is a p-fold root of $A_{11}$ and a
q-fold one of $A_{22}$ it is a (p + q)-fold one of $\mathbf{a}$.

We have to show that, in the given conditions, $A_{11}$ and $A_{22}$ have $n_1$ and $n_2$ linearly
independent characteristic solutions respectively. (Neither, of course, can have more.)
We number the rows and columns in $A_{11}$ from 1 to $n_1$, and those in $A_{22}$ from $n_1 + 1$ to
$n_1 + n_2$. If $L_{i(k)1}$ is a characteristic solution of $A_{11}$, so that

$$(A_{11})_{ij}\, L_{j(k)1} = \lambda L_{i(k)1}, \tag{5}$$

then if

$$l_{i(k)1} = L_{i(k)1} \quad (i \le n_1), \qquad l_{i(k)1} = 0 \quad (n_1 + 1 \le i \le n_1 + n_2), \tag{6}$$

we have, for $i \le n_1$,

$$a_{ij}\, l_{j(k)1} = (A_{11})_{ij}\, L_{j(k)1} = \lambda L_{i(k)1} = \lambda l_{i(k)1}, \tag{7}$$

and for $i > n_1$,

$$a_{ij}\, l_{j(k)1} = 0 \cdot L_{j(k)1} + (A_{22})_{ij} \cdot 0 = \lambda l_{i(k)1}. \tag{8}$$

Hence $l_{i(k)1}$ is a characteristic solution of $\mathbf{a}$ with the same $\lambda$. Similarly, if $L_{i(k)2}$ is a
characteristic solution of $A_{22}$, with $i \ge n_1 + 1$, there is a characteristic solution $l_{i(k)2}$ of $\mathbf{a}$, given by
taking

$$l_{i(k)2} = 0 \quad (i \le n_1); \qquad l_{i(k)2} = L_{i(k)2} \quad (n_1 + 1 \le i \le n_1 + n_2). \tag{9}$$

Further, if the solutions taken for $A_{11}$, $A_{22}$ respectively are linearly independent, the $l_{i(k)}$
derived from them are also linearly independent.

Suppose now that there are $m_1$ linearly independent characteristic solutions of $A_{11}$ and
$m_2$ of $A_{22}$, where

$$m_1 \le n_1, \quad m_2 \le n_2, \quad \text{but} \quad m_1 + m_2 < n_1 + n_2.$$

Then we can derive by the method just given fewer than $n_1 + n_2$ characteristic solutions
of $\mathbf{a}$; hence there is at least one other characteristic solution of $\mathbf{a}$, say $l_{i(m_1+m_2+1)}$, satisfying,
for some $\lambda$,

$$a_{ij}\, l_{j(m_1+m_2+1)} = \lambda\, l_{i(m_1+m_2+1)}. \tag{10}$$

But this is equivalent to the two sets of equations

$$
\begin{aligned}
(A_{11})_{ij}\, l_{j(m_1+m_2+1)} &= \lambda\, l_{i(m_1+m_2+1)} && (i \le n_1), \\
(A_{22})_{ij}\, l_{j(m_1+m_2+1)} &= \lambda\, l_{i(m_1+m_2+1)} && (n_1 + 1 \le i \le n_1 + n_2).
\end{aligned}
\tag{11}
$$

The $l_{i(m_1+m_2+1)}$ by hypothesis are not all zero; hence the first set give a characteristic solution of
$A_{11}$, or the second one of $A_{22}$, or both statements are true. In any case we can write

$$l_{i(m_1+m_2+1)} = l_{i(m_1+m_2+1)1} + l_{i(m_1+m_2+1)2}. \tag{12}$$

But by hypothesis we have already found all the characteristic solutions of $A_{11}$, $A_{22}$;
hence the $l_{i(m_1+m_2+1)1}$ can be expressed linearly in terms of the $m_1$ known solutions for $A_{11}$,
and the $l_{i(m_1+m_2+1)2}$ in terms of the $m_2$ known solutions for $A_{22}$. But then it follows that
$l_{i(m_1+m_2+1)}$ can be expressed linearly in terms of the solutions already obtained; this
contradicts the fact that it is a new independent solution. Hence $m_1 = n_1$, $m_2 = n_2$, and
the theorem is proved. In particular, if $\lambda$ is a p-fold root of $A_{11}$ and a q-fold root of $A_{22}$,
this process yields p + q independent characteristic solutions of $\mathbf{a}$ corresponding to
that $\lambda$.

If $l_{i(k)}$ is any characteristic solution of $\mathbf{a}$, the method used to get (11) will give a
characteristic solution of either $A_{11}$ or $A_{22}$ or both. If the associated $\lambda$ is a root of $A_{11}$ but not of
$A_{22}$, it will yield a solution of $A_{11}$, but a zero solution of $A_{22}$.

We notice also that

$$
\begin{pmatrix} L_1 & 0 \\ 0 & L_2 \end{pmatrix}
=
\begin{pmatrix} L_1 & 0 \\ 0 & \mathbf{1} \end{pmatrix}
\begin{pmatrix} \mathbf{1} & 0 \\ 0 & L_2 \end{pmatrix},
\tag{13}
$$

and therefore the reduction of $\mathbf{a}$ can be carried out in stages by transforming the blocks
to diagonal form in turn.

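4·065. If two matrices are both reducible to diagonal form by collineatory transformations,
a necessary and sufficient condition that they shall be reducible by the same transformation is
that they shall commute.

If

$$\mathbf{l}^{-1}\mathbf{a}\mathbf{l} = \boldsymbol{\lambda}, \qquad \mathbf{l}^{-1}\mathbf{b}\mathbf{l} = \boldsymbol{\mu},$$

where $\boldsymbol{\lambda}$, $\boldsymbol{\mu}$ are diagonal and therefore commute,

$$\mathbf{ab} = \mathbf{l}\boldsymbol{\lambda}\mathbf{l}^{-1}\,\mathbf{l}\boldsymbol{\mu}\mathbf{l}^{-1} = \mathbf{l}\boldsymbol{\lambda}\boldsymbol{\mu}\mathbf{l}^{-1} = \mathbf{l}\boldsymbol{\mu}\boldsymbol{\lambda}\mathbf{l}^{-1} = \mathbf{ba}.$$

Hence the condition is necessary.

Conversely, if

$$\mathbf{l}^{-1}\mathbf{a}\mathbf{l} = \boldsymbol{\lambda}, \qquad \mathbf{l}^{-1}\mathbf{b}\mathbf{l} = \mathbf{c},$$

where $\boldsymbol{\lambda}$ is diagonal but $\mathbf{c}$ is not assumed to be so, and $\mathbf{ab} = \mathbf{ba}$,

$$\boldsymbol{\lambda}\mathbf{c} = \mathbf{l}^{-1}\mathbf{a}\mathbf{l}\,\mathbf{l}^{-1}\mathbf{b}\mathbf{l} = \mathbf{l}^{-1}\mathbf{ab}\,\mathbf{l} = \mathbf{l}^{-1}\mathbf{ba}\,\mathbf{l} = \mathbf{c}\boldsymbol{\lambda}.$$

Taking the (i, k) element,

$$\lambda_{ii} c_{ik} = (\boldsymbol{\lambda}\mathbf{c})_{ik} = (\mathbf{c}\boldsymbol{\lambda})_{ik} = c_{ik}\lambda_{kk},$$

where summation is not understood for i or k. Hence either $c_{ik} = 0$ or $\lambda_{ii} = \lambda_{kk}$. If all diagonal
elements of $\boldsymbol{\lambda}$ are different, it follows that all non-diagonal elements of $\mathbf{c}$ are zero and $\mathbf{c}$ is
in diagonal form. If some diagonal elements of $\boldsymbol{\lambda}$ are equal, $\mathbf{c}$ can be arranged in diagonal
block form, the diagonal of each block corresponding to a set of equal diagonal components
of $\boldsymbol{\lambda}$. But by hypothesis $\mathbf{c}$ is reducible to diagonal form; hence by 4·064 the reduction can
be carried out by reducing each block in turn, and such a transformation does not affect
the corresponding elements of $\boldsymbol{\lambda}$ because they form a multiple of a unit matrix.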

The following proof of sufficiency was communicated to us by Professor P. Hall; it does not
depend on the properties of block matrices. If $\mathbf{a}$, $\mathbf{b}$ are both n × n, the statement that they can
be reduced to diagonal form is equivalent to the statement that each has n linearly independent
characteristic solutions; and we have to show that if $\mathbf{ab} = \mathbf{ba}$ there are n linearly independent
solutions characteristic of both $\mathbf{a}$ and $\mathbf{b}$. Any set of n quantities can be expressed as a linear
combination of the characteristic solutions of $\mathbf{a}$, since these are linearly independent; in particular,
any characteristic solution $x_i$ of $\mathbf{b}$ corresponding to the root $\mu$ can be so expressed, say,

$$x_i = \sum_{k=1}^{r} l_{i(k)}.$$

We may assume that $\mathbf{a}\,\mathbf{l}_{(k)} = \lambda_k \mathbf{l}_{(k)}$ with different $\lambda_k$, since if two characteristic solutions of $\mathbf{a}$ correspond
to the same $\lambda$, any linear combination of them is also a characteristic solution corresponding to that $\lambda$.
Then

$$a_{ij} b_{jk} x_k = a_{ij}\, \mu x_j = \mu \sum_{k=1}^{r} \lambda_k l_{i(k)},$$

$$b_{ij} a_{jk} x_k = b_{ij} \sum_{k=1}^{r} \lambda_k l_{j(k)},$$

and since $\mathbf{ab} = \mathbf{ba}$ it follows that $\sum_{k=1}^{r} \lambda_k l_{i(k)}$ is also a characteristic solution of $\mathbf{b}$ corresponding to the
root $\mu$. Hence so is

$$\lambda_1 x_i - \sum_{k=1}^{r} \lambda_k l_{i(k)} = (\lambda_1 - \lambda_2)\, l_{i(2)} + (\lambda_1 - \lambda_3)\, l_{i(3)} + \dots.$$

By repeating the argument we show that $(\lambda_1 - \lambda_r)(\lambda_2 - \lambda_r) \dots (\lambda_{r-1} - \lambda_r)\, l_{i(r)}$ is a characteristic solution
of $\mathbf{b}$ and therefore $l_{i(r)}$ is one. We conclude, since the ordering of $\lambda_k$ is irrelevant, that every $l_{i(k)}$ is a
characteristic solution of $\mathbf{b}$ as well as of $\mathbf{a}$. Hence every characteristic solution of $\mathbf{b}$ can be expressed
as a linear combination of simultaneous characteristic solutions of $\mathbf{a}$ and $\mathbf{b}$, and therefore any set of
n numbers can be so expressed.
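
A small worked case may make the theorem concrete. The two matrices below are chosen here purely for illustration and do not come from the original text; they commute, and a single matrix $\mathbf{l}$ reduces both to diagonal form.

```latex
\[
\mathbf{a} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
\mathbf{b} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\mathbf{ab} = \mathbf{ba} = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}.
\]
\[
\mathbf{l} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
\mathbf{l}^{-1} = \tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
\]
\[
\mathbf{l}^{-1}\mathbf{a}\,\mathbf{l} = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
\mathbf{l}^{-1}\mathbf{b}\,\mathbf{l} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]
```

The columns (1, 1) and (1, −1) of $\mathbf{l}$ are characteristic solutions of both matrices, with roots 3, 1 for $\mathbf{a}$ and 1, −1 for $\mathbf{b}$.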

4·066. The theorems of 4·064, 4·065 are of importance in quantum theory in the special
case where $\mathbf{a}$ is hermitian. It will be shown in 4·083 that if $\mathbf{a}$ is hermitian it can always be
reduced to diagonal form by a collineatory transformation, with $\mathbf{l}$ unitary. Then if $\mathbf{a}$ is
hermitian and in diagonal block form the diagonal blocks are also hermitian and can be
reduced similarly. Hence we do not need the general theorem of 4·064, but we do need
the property expressed by 4·064 (13).

4·067. The following numerical example occurs in determining the eigenvalues of the angular
momenta for a certain configuration of electrons in an atom (cf. Condon and Shortley, Theory of Atomic
Spectra, C.U.P. (1935), pp. 222-6).

Consider the three matrices:



All three matrices are symmetrical and all commute. The roots of $\mathbf{A}$ are 3/4 (four times) and 15/4. Those
of $\mathbf{B}$ are 2, 2, 6, 6, 0. As the matrices are symmetrical the reduction can be performed for each by
an orthogonal transformation (cf. 4·081). We find, starting with $\mathbf{B}$, that the orthogonal matrix


gives

$$
\mathbf{l}^{-1}\mathbf{A}\mathbf{l} = \tfrac{1}{4}
\begin{pmatrix}
3 & 0 & 0 & 0 & 0 \\
0 & 3 & 0 & 0 & 0 \\
0 & 0 & 3 & 0 & 0 \\
0 & 0 & 0 & 3 & 0 \\
0 & 0 & 0 & 0 & 15
\end{pmatrix},
\qquad
\mathbf{l}^{-1}\mathbf{B}\mathbf{l} =
\begin{pmatrix}
2 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 \\
0 & 0 & 6 & 0 & 0 \\
0 & 0 & 0 & 6 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}.
$$

The first two matrices are in diagonal form, the third in diagonal block form. The separate blocks can
now be transformed to diagonal form without disturbing $\mathbf{l}^{-1}\mathbf{A}\mathbf{l}$ and $\mathbf{l}^{-1}\mathbf{B}\mathbf{l}$. This further transformation

m, where 


tn = 


1 


V® 

Vio 

O 

i 

0 0 

„ 

0 

V® -3 

3 V 6 

0 

, 0 

0 i 

i V 1 ®, 


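gives

$$
(\mathbf{lm})^{-1}\mathbf{C}(\mathbf{lm}) = \tfrac{1}{2}
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & -2 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & -3 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}.
$$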

4·07. Minors of the matrix of cofactors: Jacobi's theorem. Let $\mathbf{a}$ be a matrix of the
nth order; denote its determinant by D. If M is a minor of order k, then the (n − k)-rowed
minor obtained by striking out from D all the rows and columns represented in M is
called the complement of M. In particular, if M is a single element $a_{ik}$ its complementary
minor is the corresponding first minor, but in this case it is often more convenient to use
the signed minor or cofactor $A_{ik}$. Similarly, it is convenient to define the signed
complement of M as follows. If M is an r-rowed minor of D, in which the rows $i_1, i_2, \dots, i_r$ are
represented, and the columns $k_1, k_2, \dots, k_r$, the signed complement of M is defined by

$$\text{Signed complement of } M = (-1)^{i_1 + i_2 + \dots + i_r + k_1 + k_2 + \dots + k_r}\ (\text{complement of } M).$$

For a principal minor $i_1 = k_1, \dots, i_r = k_r$, and the signed complement is the same as the
complement. The signed complement of the determinant itself is defined to be 1.

The matrix whose elements are $A_{ik}$ will be called the matrix of cofactors. The adjugate
matrix (4·01 (21)) is its transpose. Consider the determinant

$$
\Delta =
\begin{vmatrix}
A_{11} & A_{12} & \dots & A_{1n} \\
A_{21} & A_{22} & \dots & A_{2n} \\
\dots & \dots & \dots & \dots \\
A_{n1} & \dots & \dots & A_{nn}
\end{vmatrix}.
\tag{1}
$$

Then we can prove the following theorem about the corresponding minors of D and $\Delta$.
If M and M′ are corresponding r-rowed minors of D and $\Delta$, then

$$M' = D^{r-1}\ (\text{signed complement of } M). \tag{2}$$

In particular, if r = n,

$$\Delta = D^{n-1}. \tag{3}$$

If r = n − 1, then if $\alpha_{ik}$ is the cofactor of $A_{ik}$ in $\Delta$, we have

$$\alpha_{ik} = D^{n-2} a_{ik}. \tag{4}$$

If r = 2, suppose that M and M′ are obtained by striking out all but the i and m rows
and the k and p columns. Then

$$
\begin{vmatrix}
A_{ik} & A_{ip} \\
A_{mk} & A_{mp}
\end{vmatrix}
= D\ (\text{signed complement of } M).
\tag{5}
$$

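We prove the general theorem first for the special case where M and M′ are leading
minors and D ≠ 0. Then we can write

$$
M' =
\begin{vmatrix}
A_{11} & A_{21} & \dots & A_{r1} & 0 & 0 & \dots & 0 \\
A_{12} & \dots & \dots & A_{r2} & 0 & 0 & \dots & 0 \\
\dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots \\
A_{1r} & \dots & \dots & A_{rr} & 0 & 0 & \dots & 0 \\
A_{1,r+1} & \dots & \dots & A_{r,r+1} & 1 & 0 & \dots & 0 \\
\dots & \dots & \dots & \dots & \dots & \dots & \dots & \dots \\
A_{1,n} & \dots & \dots & A_{r,n} & 0 & 0 & \dots & 1
\end{vmatrix}.
\tag{6}
$$

Multiplying by

$$
D =
\begin{vmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & \dots & \dots & a_{2n} \\
\dots & \dots & \dots & \dots \\
a_{n1} & \dots & \dots & a_{nn}
\end{vmatrix},
\tag{7}
$$

we have

$$
M'D =
\begin{vmatrix}
D & 0 & \dots & 0 & a_{1,r+1} & \dots & a_{1,n} \\
0 & D & \dots & 0 & \dots & \dots & \dots \\
\dots & \dots & \dots & \dots & \dots & \dots & \dots \\
0 & \dots & \dots & D & a_{r,r+1} & \dots & a_{r,n} \\
0 & \dots & \dots & 0 & a_{r+1,r+1} & \dots & \dots \\
\dots & \dots & \dots & \dots & \dots & \dots & \dots \\
0 & \dots & \dots & 0 & a_{n,r+1} & \dots & a_{nn}
\end{vmatrix}
\tag{8}
$$

$$= D^r \times \text{complement of } M \text{ in } D,$$

which proves the theorem. It should be noticed that with our definition of the
complement of D itself this proof holds for r = n.

If D = 0 it follows from 4·05 (7) that the minors of the elements of any two rows (or
columns) of D are proportional and therefore M′ = 0 for r ≥ 2, and the theorem still holds.
If D = 0 and r = 1, $D^{r-1}$ is undefined, but M′ = $A_{11}$ = complement of M in D.

The proof for the general case when M is not a leading minor is done by rearranging
both determinants so as to bring M into the leading position and studying the changes of
sign involved. Details of the proof will be found in M. Bôcher, Introduction to Higher
Algebra, p. 32.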
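
Jacobi's theorem is easy to test on small integer matrices. The sketch below is an illustration under stated assumptions: the helper names are invented here, indices are counted from 0 (which leaves the parity of the sign exponent unchanged), determinants are expanded along the first row, and the determinant of an empty matrix is taken as 1 to match the convention for the signed complement of D itself.

```python
from itertools import combinations

def minor(m, i, k):
    """Matrix left after striking out row i and column k."""
    return [row[:k] + row[k + 1:] for r, row in enumerate(m) if r != i]

def det(m):
    """Determinant by expansion along the first row; det of the empty matrix is 1."""
    if not m:
        return 1
    return sum((-1) ** k * m[0][k] * det(minor(m, 0, k)) for k in range(len(m)))

def cofactor_matrix(m):
    """Matrix of cofactors A_ik."""
    n = len(m)
    return [[(-1) ** (i + k) * det(minor(m, i, k)) for k in range(n)]
            for i in range(n)]

def submatrix(m, rows, cols):
    """Elements of m in the given rows and columns."""
    return [[m[i][k] for k in cols] for i in rows]

def jacobi_holds(a, rows, cols):
    """Check M' = D**(r-1) * (signed complement of M) for one choice of minor."""
    n, r = len(a), len(rows)
    m_prime = det(submatrix(cofactor_matrix(a), rows, cols))
    other_rows = [i for i in range(n) if i not in rows]
    other_cols = [k for k in range(n) if k not in cols]
    sign = (-1) ** (sum(rows) + sum(cols))
    signed_complement = sign * det(submatrix(a, other_rows, other_cols))
    return m_prime == det(a) ** (r - 1) * signed_complement

def check_all_minors(a):
    """Apply the check to every r-rowed minor, r = 1 to n."""
    n = len(a)
    return all(jacobi_holds(a, rows, cols)
               for r in range(1, n + 1)
               for rows in combinations(range(n), r)
               for cols in combinations(range(n), r))

a = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]    # D = 18
print(check_all_minors(a))               # True
```

For r = 1 with D = 0 the check relies on Python evaluating `0 ** 0` as 1, which reproduces the statement that M′ is then simply the cofactor.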

4·08. Quadratic and hermitian forms. A function of n quantities $x_i$ of the form
$a_{ik} x_i x_k$ ($a_{ik}$ real) is called a quadratic form. If $y_i$ are another set of quantities $a_{ik} x_i y_k$
is called a bilinear form. In matrix notation they may be written $\tilde{\mathbf{x}}\mathbf{a}\mathbf{x}$ and $\tilde{\mathbf{x}}\mathbf{a}\mathbf{y}$. $a_{ik}$ can be
taken to be a symmetrical matrix in the former case. For if $a_{12} x_1 x_2$ is one term, $a_{21} x_2 x_1$
is another, and their sum is unaltered if we replace both $a_{12}$ and $a_{21}$ by $\tfrac{1}{2}(a_{12} + a_{21})$; and
similarly for any pair of values of i and k.

By a change of variables a quadratic form can always be reduced to a sum of squares
(not necessarily with positive coefficients). A simple and very useful method is to take

$$\xi_1 = x_1 + \frac{a_{12}}{a_{11}} x_2 + \dots + \frac{a_{1n}}{a_{11}} x_n \tag{1}$$

as a new variable. Then if we subtract $a_{11}\xi_1^2$ from $a_{ik} x_i x_k$ all terms containing $x_1$ cancel.
Then by introducing a new variable $\xi_2$ we can similarly remove all terms in $x_2$; and in
general we shall reduce the quadratic to the sum of n squares. Then

$$a_{11} x_1^2 + 2 a_{12} x_1 x_2 + a_{22} x_2^2 + \dots = \beta_1 \xi_1^2 + \beta_2 \xi_2^2 + \dots + \beta_n \xi_n^2, \tag{2}$$

where each $\xi_r$ starts with $x_r$. The method fails if the form contains no square terms
originally, for then there is no starting-point; but a simple change of variable will
introduce them. Thus if $a_{ik} = 0$ when i = k, but $a_{12} \ne 0$, put $x_2 = x_1 + \xi_2$; then $x_1 x_2 = x_1^2 + x_1 \xi_2$,
and we can proceed.
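
As a short worked case of this reduction (the particular form is chosen here for illustration and is not from the original text), take $x_1^2 + 4x_1x_2 + 3x_2^2$, so that $a_{11} = 1$, $a_{12} = a_{21} = 2$, $a_{22} = 3$:

```latex
\[
\xi_1 = x_1 + \frac{a_{12}}{a_{11}}\,x_2 = x_1 + 2x_2, \qquad
a_{11}\xi_1^2 = x_1^2 + 4x_1x_2 + 4x_2^2,
\]
\[
x_1^2 + 4x_1x_2 + 3x_2^2 - \xi_1^2 = -x_2^2, \qquad \xi_2 = x_2,
\]
\[
x_1^2 + 4x_1x_2 + 3x_2^2 = \xi_1^2 - \xi_2^2, \qquad
\beta_1 = 1,\ \beta_2 = -1, \qquad
\beta_1\beta_2 = -1 =
\begin{vmatrix} 1 & 2 \\ 2 & 3 \end{vmatrix}.
\]
```

One coefficient is positive and one negative, so this form is not definite.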

The features to be noticed* are

(1) The product of the coefficients on the right of (2), up to $\beta_r$, is the determinant
of the coefficients on the left up to that of $x_r^2$.

(2) The reduction to sums of squares can be done in an infinite number of ways, of
which this is only one; but however it is done the numbers of positive, negative, and zero
coefficients are the same so long as the transformation is non-singular; that is, if the old
variables can be expressed uniquely in terms of the new ones and vice versa.

(3) $a_{ik} x_i x_k$ is called a positive definite form if it is in general positive and can be zero
only if all the $x_i$ are zero; and a set of necessary and sufficient conditions for it to be
positive definite is

$$
a_{11} > 0, \quad
\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} > 0, \quad
\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} > 0, \quad \dots, \quad
\| a_{ik} \| > 0.
\tag{3}
$$

A quadratic form can be essentially positive without being positive definite; for
instance, $(x - 2y)^2 = 0$ for x = 2, y = 1. For such a form $\| a_{ik} \| = 0$.

$a_{ik} x_i x_k$ is negative definite if $-a_{ik} x_i x_k$ is positive definite.

(4) If all these determinants of orders > r vanish, but that of order r does not, the form
is reducible to r squares and is said to be of rank r. The matrix $a_{ik}$ is then also of rank r.
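
The determinantal conditions (3) translate directly into a test on the leading minors. The following is a minimal sketch, not a recommended numerical method: the function names are assumptions, determinants are expanded along the first row (adequate only for small orders), and exact integer or rational entries are assumed.

```python
def det(m):
    """Determinant by expansion along the first row (small orders only)."""
    if not m:
        return 1
    return sum((-1) ** k * m[0][k]
               * det([row[:k] + row[k + 1:] for row in m[1:]])
               for k in range(len(m)))

def leading_minors(a):
    """Leading minors of orders 1, 2, ..., n."""
    return [det([row[:r] for row in a[:r]]) for r in range(1, len(a) + 1)]

def is_positive_definite(a):
    """Conditions (3): every leading minor must be strictly positive."""
    return all(d > 0 for d in leading_minors(a))

# 2x^2 + 2xy + 2y^2 is positive definite.
print(is_positive_definite([[2, 1], [1, 2]]))      # True
# (x - 2y)^2 = x^2 - 4xy + 4y^2 vanishes at x = 2, y = 1.
print(is_positive_definite([[1, -2], [-2, 4]]))    # False
```

For the second form the last determinant is zero, in agreement with the remark that an essentially positive form that is not positive definite has $\| a_{ik} \| = 0$.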

If $a_{ik}$ is a hermitian matrix, $a_{ik} x_i x_k^*$ is called a hermitian form. It is real; for if we take
a particular term $a_{ik} x_i x_k^*$, another particular term is got by interchanging the values of
i and k and is therefore $a_{ki} x_k x_i^*$. But $a_{ki} = a_{ik}^*$ for a hermitian matrix, and therefore
these two terms are conjugate complexes and their sum is real. Similarly $a_{ik} x_i y_k^*$ is called
a hermitian bilinear form. The theory of hermitian forms is almost identical with that of
quadratic forms, the only change being that in place of (2) we shall have

$$a_{ik} x_i x_k^* = \beta_1 \xi_1 \xi_1^* + \dots + \beta_n \xi_n \xi_n^*, \tag{4}$$

where the coefficients on the right are all real; and the form will be positive definite under
the same conditions as (3). The determinants in (3) are all real; for they are unaltered by
transposing rows and columns; but this replaces i by −i everywhere, and the imaginary
part would be reversed if any determinant was complex.


* Proofs will be found in the standard books, e.g. W. L. Ferrar's Algebra, Chapters 10 and 11.

4·081. Reduction of pair of hermitian forms. Any pair of real quadratic forms
$a_{ik} x_i x_k$, $b_{ik} x_i x_k$ can in general be reduced simultaneously to sums of squares by
transformation of variables; this is always true if one of the forms is positive definite. Any pair
of hermitian forms can similarly be reduced simultaneously to the form (4) under the same
conditions. The analysis for the two cases is practically the same; we shall take the
hermitian case since it includes the other, and has applications in quantum theory. The case
where $a_{ik}$ is symmetrical presents itself in almost every branch of physics. The simplest
is the one we have already discussed, the reduction of a second-order symmetrical tensor,
with n = 3, to principal axes, which amounts to taking $a_{ik} = \delta_{ik}$. But without this
condition we could reduce two forms in three variables to sums of squares simultaneously by
a linear change of variables; geometrically this implies that any two concentric quadric
surfaces have a set of three mutually conjugate diameters in common. The case of n
variables arises in the theory of small oscillations in dynamics, where it is clear that if we
can reduce the kinetic and potential energies simultaneously to the forms

$$2T = m_1 \dot{\xi}_1^2 + m_2 \dot{\xi}_2^2 + \dots + m_n \dot{\xi}_n^2,$$

$$2V = m_1 \lambda_1 \xi_1^2 + m_2 \lambda_2 \xi_2^2 + \dots + m_n \lambda_n \xi_n^2,$$

each variable will satisfy a differential equation

$$\ddot{\xi}_r = -\lambda_r \xi_r,$$

and their variations with time are therefore independent. The condition for stability is
therefore that all the $\lambda_r$ shall be positive. Since T is positive definite all the $m_r$ are positive,
and therefore V also must be positive definite for stability.

It is convenient to start with the set of n equations

$$\lambda a_{ik} x_k = b_{ik} x_k. \tag{1}$$

Then the general method of solution leads to the equation for $\lambda$

$$\| \lambda a_{ik} - b_{ik} \| = 0. \tag{2}$$

For a general real $\lambda$, $\| \lambda a_{ik} - b_{ik} \|$ is hermitian and therefore real. Hence all the
coefficients of $\lambda$ in its expansion in powers of $\lambda$ are real. We shall assume that $a_{ik} x_k x_i^*$ is positive
definite.

Let $\lambda_1$ and $\lambda_2$ be two roots of (2), and let $x_{i1}$, $x_{i2}$ be corresponding non-zero values of $x_i$.
Then

$$\lambda_1 a_{ik} x_{k1} = b_{ik} x_{k1}, \tag{3}$$

$$\lambda_2 a_{ik} x_{k2} = b_{ik} x_{k2}. \tag{4}$$

Multiply (3) by $x_{i2}^*$ and add; multiply (4) by $x_{i1}^*$ and add. Then

$$\lambda_1 a_{ik} x_{k1} x_{i2}^* = b_{ik} x_{k1} x_{i2}^*, \tag{5}$$

$$\lambda_2 a_{ik} x_{k2} x_{i1}^* = b_{ik} x_{k2} x_{i1}^*. \tag{6}$$

Taking the conjugate of (5) we have

$$\lambda_1^* a_{ik}^* x_{k1}^* x_{i2} = b_{ik}^* x_{k1}^* x_{i2}. \tag{7}$$

But

$$a_{ik}^* x_{k1}^* x_{i2} = a_{ki} x_{i2} x_{k1}^* = a_{ik} x_{k2} x_{i1}^* \tag{8}$$

by interchange of dummy suffixes. Similarly

$$b_{ik}^* x_{k1}^* x_{i2} = b_{ik} x_{k2} x_{i1}^* \tag{9}$$

and therefore, by comparing (8) and (9) with (6),

$$(\lambda_2 - \lambda_1^*)\, a_{ik} x_{k2} x_{i1}^* = 0. \tag{10}$$

First let the two solutions compared be the same; then $a_{ik} x_{k1} x_{i1}^*$ is real. If it was zero
$b_{ik} x_{k1} x_{i1}^*$ would also be zero by (5). This is impossible if either of these expressions is a
positive definite form. It then follows that $\lambda_1 = \lambda_1^*$ and therefore $\lambda_1$ is real. Hence all the
roots of (2) are real, and for real forms the ratios of the $x_{i1}$ are also real.

Next let $\lambda_1$ and $\lambda_2$ be different. Then since they are real $\lambda_2 \ne \lambda_1^*$, and therefore

$$a_{ik} x_{k2} x_{i1}^* = 0, \qquad b_{ik} x_{k2} x_{i1}^* = 0. \tag{11}$$

We shall call this property of the solutions orthogonality with respect to $\mathbf{a}$ and $\mathbf{b}$ and
denote it by "orthogonal a".

At each simple root of (2) the determinant has a non-zero derivative with regard to $\lambda$.
But this derivative is a linear function of first minors of the determinant, and therefore
not all of these can be zero. It follows that at a simple root of (2) the rank of the matrix

$$c_{ik} = \lambda a_{ik} - b_{ik}$$

is n − 1. At an r-fold root the rth derivative is not zero and is a linear function of rth minors,
and therefore the rank of $c_{ik}$ is at least n − r. We shall show that it is actually equal to
n − r, but this requires the rather intricate argument of the theorem of the separation of
the roots.

We first take the case where all the roots are simple. Then for any root, say $\lambda_1$, the
ratios of the $x_i$ are unique and we can write

$$x_{i1} = l_{i1} \xi_1. \tag{12}$$

Evidently

$$\lambda_1 a_{ik} l_{k1} = b_{ik} l_{k1}, \tag{13}$$

$$\lambda_1 a_{ik} l_{k1} l_{i1}^* = b_{ik} l_{k1} l_{i1}^*. \tag{14}$$

$a_{ik} l_{k1} l_{i1}^*$ is real and not zero. Also if $\lambda_1$ and $\lambda_2$ are different, by (11)

$$a_{ik} l_{k1} l_{i2}^* = b_{ik} l_{k1} l_{i2}^* = 0. \tag{15}$$

If we denote the various solutions by the suffix j, we can regard $l_{ij}$ as a transforming matrix:

$$l_{ij}^* a_{ik} l_{kl} = (\mathbf{l}^\dagger \mathbf{a} \mathbf{l})_{jl}. \tag{16}$$

It follows from (15) that all non-diagonal components of $\mathbf{l}^\dagger\mathbf{a}\mathbf{l}$ and $\mathbf{l}^\dagger\mathbf{b}\mathbf{l}$ are zero; hence this
transformation reduces both matrices to diagonal form. If we denote the diagonal terms
by $A_{jl}$ and $B_{jl}$ (l = j), $B_{jj}/A_{jj}$ is equal to the corresponding value of $\lambda$, by (14). Again,

$$\| l_{ij}^* \| \, \| a_{ik} \| \, \| l_{kl} \| = \| l_{ij}^* a_{ik} l_{kl} \|, \tag{17}$$

which is the product of diagonal components, none of which is zero. Hence $\mathbf{l}$ and $\mathbf{l}^\dagger$ are
non-singular matrices, and if we now write in general

$$X_i = l_{ij} \xi_j \tag{18}$$

it will be possible to solve and find the $\xi_j$ for any assigned values of the $X_i$. Then

$$a_{ik} X_k X_i^* = a_{ik} l_{kl} \xi_l\, l_{ij}^* \xi_j^* = A_{jl}\, \xi_l \xi_j^*, \tag{19}$$

$$b_{ik} X_k X_i^* = B_{jl}\, \xi_l \xi_j^*. \tag{20}$$

Thus all terms with j ≠ l are absent from both forms.
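
A real 2 × 2 illustration of the simultaneous reduction follows; the two matrices are chosen here for convenience and are not taken from the original text.

```latex
\[
\mathbf{a} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
\mathbf{b} = \begin{pmatrix} 5 & 1 \\ 1 & 5 \end{pmatrix}, \qquad
\| \lambda a_{ik} - b_{ik} \| = (2\lambda - 5)^2 - (\lambda - 1)^2 = 3(\lambda - 2)(\lambda - 4).
\]
\[
\lambda_1 = 2:\ l_{i1} = (1, 1); \qquad \lambda_2 = 4:\ l_{i2} = (1, -1); \qquad
a_{ik} l_{k1} l_{i2} = 0, \quad b_{ik} l_{k1} l_{i2} = 0.
\]
\[
\mathbf{l} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
\tilde{\mathbf{l}}\,\mathbf{a}\,\mathbf{l} = \begin{pmatrix} 6 & 0 \\ 0 & 2 \end{pmatrix}, \qquad
\tilde{\mathbf{l}}\,\mathbf{b}\,\mathbf{l} = \begin{pmatrix} 12 & 0 \\ 0 & 8 \end{pmatrix}.
\]
\[
X_1 = \xi_1 + \xi_2,\ X_2 = \xi_1 - \xi_2: \qquad
2X_1^2 + 2X_1X_2 + 2X_2^2 = 6\xi_1^2 + 2\xi_2^2, \qquad
5X_1^2 + 2X_1X_2 + 5X_2^2 = 12\xi_1^2 + 8\xi_2^2.
\]
```

The ratios of corresponding diagonal terms, 12/6 and 8/2, reproduce the roots 2 and 4.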

An important special case is where $a_{ik} = \delta_{ik}$, so that the original equations reduce to

$$\lambda x_i = b_{ik} x_k.$$

In this case

$$a_{ik} X_i X_k^* = |X_1|^2 + \dots + |X_n|^2,$$

and is a positive definite form, and

$$A_{jl} = l_{ij}^* a_{ik} l_{kl} = l_{ij}^* l_{il}$$

must be in diagonal form. But for any solution $\lambda_j$ the $l_{ij}$ are arbitrary by a constant factor.
Choose this so that

$$l_{ij}^* l_{il} = 1 \quad (j = l).$$

In any case $l_{ij}^* l_{il} = 0$ (j ≠ l); hence

$$l_{ij}^* l_{il} = \delta_{jl},$$

and the transformation does not alter the unit matrix; hence $\mathbf{l}$ is unitary. Hence any
hermitian matrix $\mathbf{b}$ with no repeated root can be reduced to diagonal form by a
transformation $\mathbf{l}^\dagger\mathbf{b}\mathbf{l}$, where $\mathbf{l}$ is unitary; and the transformation will reduce $b_{ik} X_i^* X_k$ to
$\sum_j \lambda_j \xi_j \xi_j^*$. If $\mathbf{b}$ is a real symmetrical matrix $\mathbf{l}$ will be a real unitary matrix, that is, an
orthogonal matrix.

4·082. The theorem of the separation of the roots. This resembles Sturm's theorem
in the theory of equations. We still take $\mathbf{a}$, $\mathbf{b}$ hermitian n × n, and $\mathbf{x}^\dagger\mathbf{a}\mathbf{x}$ positive definite,
and write

$$\lambda a_{ik} - b_{ik} = c_{ik}, \qquad \lambda\mathbf{a} - \mathbf{b} = \mathbf{c}. \tag{1}$$

We denote $\| c_{ik} \|$ by D. We have already seen that when $\lambda$ is any root of D = 0, D cannot
be of higher rank than n − 1, and that if $\lambda$ is an r-fold root D cannot be of lower rank than
n − r. We propose to show that the rank is actually equal to n − r.

Let the leading minors of D of orders n, n − 1, ..., 1, 0 be denoted by

$$D_n\ (= D),\quad D_{n-1},\quad \dots,\quad D_1\ (= c_{11}),\quad D_0\ (= 1).$$

We assume first that D has no repeated root and that no two consecutive $D_r$, $D_{r-1}$ vanish
for the same value of $\lambda$. We consider for any value of $\lambda$ the changes of sign as we go along
the finite sequence $D_n$ to $D_0$.

From Jacobi's theorem we have, if $C_{ik}$ is the cofactor of $c_{ik}$ in D,

$$
D D_{n-2} =
\begin{vmatrix}
C_{n-1,n-1} & C_{n-1,n} \\
C_{n,n-1} & C_{nn}
\end{vmatrix}
= D_{n-1} C_{n-1,n-1} - | C_{n-1,n} |^2.
\tag{2}
$$

Similarly, if $C^{s}_{ik}$ denotes the cofactor of $c_{ik}$ in $D_s$,

$$D_s D_{s-2} = D_{s-1} C^{s}_{s-1,s-1} - | C^{s}_{s-1,s} |^2. \tag{3}$$

If $\lambda$ is a root of $D_{s-1}$, then since neither $D_s$ nor $D_{s-2}$ is zero for that value of $\lambda$, $D_s D_{s-2}$ is not
zero, and must be negative. Hence when $D_{s-1} = 0$, $D_s$ and $D_{s-2}$ have opposite signs. Now
since $\mathbf{x}^\dagger\mathbf{a}\mathbf{x}$ is positive definite we see that as $\lambda \to +\infty$ the signs of the $D_s$ are all positive.

When $\lambda \to -\infty$ they are alternately positive and negative. Hence as $\lambda$ increases from
$-\infty$ to $+\infty$, n changes of sign are lost in the sequence. We can see, however, that the
number of changes of sign is unaffected as $\lambda$ varies, except by changes of sign of D itself.
For if $\lambda$ increases through a zero of $D_{s-1}$, $D_s$ and $D_{s-2}$ have opposite signs, and in any case
there is one change of sign in the sequence $D_s$, $D_{s-1}$, $D_{s-2}$, and there is no alteration in the
total number of changes of sign along the sequence $D_n$ to $D_0$. But $D_0 = 1$ and never changes
sign; therefore all the n changes of sign lost are lost at the beginning, that is, by $\lambda$ passing
through n real zeros of D.

The figure shows how the roots of the sequence of equations $D_n = 0$, $D_{n-1} = 0$, ..., must lie relative to one another. The graph of $D_0$ lies parallel to the $\lambda$ axis. That of $D_1$ is a straight line passing to $+\infty$ at $\lambda = +\infty$ and to $-\infty$ at $\lambda = -\infty$. $D_1 = 0$ has one real root.

$D_2 \to +\infty$ as $\lambda \to \pm\infty$, and when $D_1 = 0$ it is negative. Therefore $D_2 = 0$ has two real roots, one less and one greater than the root of $D_1 = 0$.

$D_3 \to -\infty$ as $\lambda \to -\infty$ and to $+\infty$ as $\lambda \to +\infty$.

When $D_2 = 0$ and $D_1 < 0$, $D_3$ must be $> 0$; when $D_2 = 0$ and $D_1 > 0$, $D_3 < 0$. Therefore the roots of $D_2 = 0$ separate those of $D_3 = 0$. Similarly, those of $D_s = 0$ separate those of $D_{s+1} = 0$. This proves the theorem of the separation of the roots when those of $D_n = 0$ are all different.
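The separation property can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes NumPy, uses a randomly generated real symmetric pair (`a` positive definite, `b` arbitrary), and the helper names `pencil_roots` and `interlaces` are illustrative only.

```python
import numpy as np

def pencil_roots(a, b):
    """Roots of det(lambda*a - b) = 0, found as eigenvalues of a^{-1} b."""
    return np.sort(np.linalg.eigvals(np.linalg.solve(a, b)).real)

def interlaces(inner, outer, tol=1e-9):
    """True if every root in `inner` lies between consecutive roots of `outer`."""
    return all(outer[i] - tol <= inner[i] <= outer[i + 1] + tol
               for i in range(len(inner)))

rng = np.random.default_rng(0)
n = 5
m = rng.normal(size=(n, n))
a = m @ m.T + n * np.eye(n)      # symmetric positive definite (assumed test data)
b = rng.normal(size=(n, n))
b = b + b.T                      # symmetric, otherwise arbitrary

# roots[s-1] holds the roots of the leading minor D_s = 0
roots = [pencil_roots(a[:s, :s], b[:s, :s]) for s in range(1, n + 1)]
checks = [interlaces(roots[s - 1], roots[s]) for s in range(1, n)]
```

Each entry of `checks` compares the roots of $D_s = 0$ with those of $D_{s+1} = 0$; for a generic pair all entries should be true.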

If $D_n$ has an $r$-fold root, $\lambda = \lambda_p$, by making small changes in the $b_{ik}$, to $b'_{ik}$ say, we can alter the equation so that all roots become distinct. If $D_s$ and $D_{s-1}$ vanish for the same $\lambda$, it will be possible to alter $D_s$ by changing, say, $b_{1s}$ without altering $D_{s-1}$. Hence if the conditions postulated are not satisfied we can make small alterations so that they are satisfied. Consider the group of $r$ roots that coalesce at $\lambda_p$ when $b'_{ik} \to b_{ik}$. Between them there are $r-1$ zeros of $D_{n-1}$, $r-2$ of $D_{n-2}$, ..., 1 of $D_{n-r+1}$, none of $D_{n-r}$; in the limit these zeros are all carried to $\lambda_p$.

Since we could arrange the determinant with any diagonal term in the top left corner by an interchange of two rows followed by one of two columns, without disturbing its hermitian form, it follows that if $\lambda_p$ is an $r$-fold root of $D = 0$, all the principal minors of orders from $n$ down to $n-r+1$ vanish for $\lambda = \lambda_p$. The principal minors of order $n-r$ cannot all vanish because then $\lambda_p$ would be an $(r+1)$-fold root of $D = 0$.

We have seen that if a determinant and a first minor vanish, then the first minors either of all elements in the same row or of all in the same column vanish. Consider then any principal minor of order $n-r+2$. This and any first principal minor in it vanish for $\lambda = \lambda_p$, the latter being a principal minor of order $n-r+1$ of $D$. Let the row and column omitted be the $i$th. Then either all the minors of $c_{ik}$ or of $c_{ki}$ vanish. But these are conjugate complexes, and if one set do the others do. Hence all minors of order $n-r+1$ of $D$ vanish, and when $\lambda = \lambda_p$, $c_{ik}$ is a matrix of rank $n-r$, as was to be proved.

Again, $\lambda = \lambda_p$ is at least a simple zero of all minors of order $n-r+1$ and higher orders. But the derivative of any minor is a linear function of those of the next lower order. Hence the minors of order $n-r+2$ have double zeros and those of order $n-1$, that is, the first minors of $D$, have at least $(r-1)$-fold zeros. We shall meet this result again when we come to the operational treatment of small oscillations.†

4·083. Extension of the orthogonal property. We have proved this property for the $l_{ij}$ associated with any pair of different roots of the determinantal equation. If $\lambda = \lambda_j$ is an $r$-fold root, then since the matrix $c_{ik}$ is of rank $n-r$ for $\lambda = \lambda_j$ we can choose $r$ linearly independent sets of ratios of the $l_{ij}$. Let two of these be $l_{i1}$ and $l_{i2}$. Then

$$\lambda_j a_{ik} l_{k1} = b_{ik} l_{k1}, \qquad \lambda_j a_{ik} l_{k2} = b_{ik} l_{k2}, \tag{1}$$

and therefore

$$\lambda_j a_{ik} l_{k2} l_{i1}^* = b_{ik} l_{k2} l_{i1}^*. \tag{2}$$

These expressions may be zero; if so, the solutions are already orthogonal $a$ (that is, orthogonal with respect to the form $a$). If not, consider

$$m_{i2} = l_{i2} - \theta l_{i1}, \tag{3}$$

where $\theta$ is independent of $i$. The $m_{i2}$ are not all zero, since $l_{i1}$ and $l_{i2}$ are not proportional. Then for any $\theta$

$$\lambda_j a_{ik} m_{k2} = b_{ik} m_{k2}, \tag{4}$$

$$a_{ik} m_{k2} l_{i1}^* = a_{ik} l_{k2} l_{i1}^* - \theta\, a_{ik} l_{k1} l_{i1}^*, \tag{5}$$

which can be made zero by a suitable choice of $\theta$; and then it follows from (2) that $b_{ik} m_{k2} l_{i1}^*$ is also zero.

If $\lambda_j$ is an $r$-fold root we can choose one set of $l_{ij}$ for it and then make all the others orthogonal $a$ to that set by subtracting suitable multiples of it. Choosing one of the modified sets we can make all the others orthogonal $a$ to that one, and proceed until we have made all the $r$ sets mutually orthogonal $a$. Thus even if the equation for $\lambda$ has multiple roots we can still choose $n$ sets of $l_{ij}$ with the orthogonality property, and the reduction of $a_{ik}$ and $b_{ik}$ to diagonal form can still be carried out (in fact in infinitely many ways). If $a_{ik} = \delta_{ik}$, $l$ can be taken unitary as before.

If the normal coordinate corresponding to a particular solution is required we can proceed as follows. Let $x_i = l_{i1}\xi_1$ be the solution. Take a general set of $x_i$ and determine $\xi_1$ so as to make $a_{ik}(x_k - l_{k1}\xi_1)(x_i^* - l_{i1}^*\xi_1^*)$ stationary for variations of $\xi_1$. Both real and imaginary parts may be varied, but the derivatives with regard to them can both vanish if and only if the derivatives with regard to $\xi_1$ and $\xi_1^*$, treated as independent, vanish. Hence

$$a_{ik}(x_k - l_{k1}\xi_1)\, l_{i1}^* = 0, \qquad l_{k1}\, a_{ik}(x_i^* - l_{i1}^*\xi_1^*) = 0,$$

the second of which reduces to the first on interchanging $i$ and $k$ and taking the conjugate. Hence

$$A_{11}\xi_1 = l_{i1}^* a_{ik} x_k, \qquad A_{11} = a_{ik} l_{k1} l_{i1}^*.$$

If $x_k = l_{k2}\xi_2$, say, $l_{i1}^* a_{ik} x_k = 0$ by the orthogonal property; hence the $\xi$ that is not zero in any characteristic solution can be found explicitly in terms of the $x_i$ without the need to find all the solutions.

† First proved by Weierstrass, Monatsber. d. K. Akad. d. Wiss. Berlin, 1858; Werke, 1, 233-46. Several other proofs exist; one is in Lamb's Higher Mechanics, 1920, 222-6; another in Bromwich's Quadratic Forms, 1906. The last is specially interesting because it proceeds by discovering normal coordinates directly and reducing the quadratic forms to sums of squares by successive substitution. We have followed Routh's method because it is the most easily adapted to gyroscopic systems.


As an example, take 



b = 



$\tilde{x} a x$ is positive definite and the roots of $\det(b - \lambda a) = 0$ are 2, 2, $-3$. The three equations for $\lambda = 2$ all reduce to

$$x_1 - x_2 - 2x_3 = 0,$$

and we may take solutions to be $l_1 = (1, 1, 0)$, $l_2 = (0, -2, 1)$. These give

$$\tilde{l}_1 a l_1 = 4, \qquad \tilde{l}_1 a l_2 = -2;$$

hence a solution normal to the first is $m_2 = 2(l_2 + \tfrac{1}{2} l_1) = (1, -3, 2)$. The solution for $\lambda = -3$ is $l_3 = (-3, 1, 2)$.

Then we may write

$$x_1 = \xi_1 + \xi_2 - 3\xi_3, \qquad x_2 = \xi_1 - 3\xi_2 + \xi_3, \qquad x_3 = 2\xi_2 + 2\xi_3.$$

Also $\tilde{m}_2 a m_2 = 64$, $\tilde{l}_3 a l_3 = 64$,

hence $\tilde{x} a x = 4\xi_1^2 + 64\xi_2^2 + 64\xi_3^2$, $\quad \tilde{x} b x = 8\xi_1^2 + 128\xi_2^2 - 192\xi_3^2$.


To illustrate what can happen if neither matrix is that of a positive definite form, we take the forms $\tfrac{1}{2}(x_1^2 - x_2^2)$, $x_1 x_2$. The equations required are

$$\lambda x_1 = x_2, \qquad -\lambda x_2 = x_1,$$

and $\lambda^2 = -1$, $\lambda = \pm i$. We can therefore take

$$x_1 = \xi_1 + \xi_2, \qquad x_2 = i(\xi_1 - \xi_2),$$

and $\tfrac{1}{2}(x_1^2 - x_2^2) = \xi_1^2 + \xi_2^2$, $\quad x_1 x_2 = i(\xi_1^2 - \xi_2^2)$.

Thus the forms are reduced to sums of squares, but not with real coefficients. Here $a_{ik} l_{k1} l_{i1}^* = 0$.

Since a hermitian matrix can always be reduced to diagonal form by a transformation $l^{-1} a l$, where $l$ is unitary, the condition required in 4·065 that the matrices shall be reducible to diagonal form by a collineatory transformation is always satisfied when the matrices are hermitian.

The peculiarity in the case of equal roots may be illustrated geometrically by the problem of finding principal axes of an ellipsoid of revolution. Any diameter in the plane of circular symmetry is a principal axis. If we arbitrarily choose two directions $l_1$, $l_2$ in this plane they will not in general be perpendicular; but we can use them to form a coplanar direction $l_2 - \theta l_1$ perpendicular to $l_1$.

4·084. The stationary property of $\lambda$: Rayleigh's principle. For any set of $x_i$ we can define a quantity $\lambda$ by

$$\lambda = \frac{b_{ik} x_k x_i^*}{a_{ik} x_k x_i^*} = \frac{\sum_j \lambda_j A_{jj}\, \xi_j \xi_j^*}{\sum_j A_{jj}\, \xi_j \xi_j^*}. \tag{1}$$

This always lies between the greatest and least of the $\lambda_j$, since all $A_{jj} > 0$. If all the $\xi_j$ are zero except one, the ratio reduces to the corresponding $\lambda_j$. If the others are small the change in $\lambda$ is of the second order. Hence the ratio is stationary for all variations of the $\xi_j$ and therefore of the $x_i$. This is Rayleigh's principle.

Conversely, the ratio has no other stationary values. For we have

$$\lambda\, a_{ik} x_k x_i^* = b_{ik} x_k x_i^* \tag{2}$$

and if $\delta\lambda$ is of the second order we have to the first order for all changes of the $x_i$, both real and imaginary changes being permitted,

$$\lambda\, \delta(a_{ik} x_k x_i^*) = \delta(b_{ik} x_k x_i^*), \tag{3}$$

whence

$$\lambda\, a_{ik} x_k = b_{ik} x_k, \qquad \lambda\, a_{ik} x_i^* = b_{ik} x_i^*, \tag{4}$$

the second of which reduces to the first on interchanging $i$ and $k$ and taking the conjugate complex.

The principle is often useful in numerical work. Without actually forming and solving the determinantal equation, we may be able to see roughly what ratios the $x_i$ must have in a particular solution, usually the one with the smallest $\lambda$. If we substitute rough values for these ratios in (1) and work out $\lambda$, it will be correct within a second-order error. This may be accurate enough or it may serve as a starting point for a closer approximation to the ratios of the $x_i$. If the $\lambda_j$ all have the same sign, the approximate value taken by $\lambda$ with any ratios of the $x_i$ is numerically greater than the smallest of them. Direct use of the principle therefore never underestimates the smallest root.†
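The second-order accuracy claimed above can be seen in a small numerical experiment. This is only an illustrative sketch under stated assumptions: NumPy is assumed, the matrices `a` and `b` are made-up positive definite test data, and `rayleigh` is a hypothetical helper name.

```python
import numpy as np

def rayleigh(a, b, x):
    """Ratio b_ik x_k x_i* / a_ik x_k x_i* of 4.084 (1)."""
    x = np.asarray(x, dtype=complex)
    return ((x.conj() @ b @ x) / (x.conj() @ a @ x)).real

# Assumed test matrices: both real symmetric and positive definite.
a = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.5]])
b = np.array([[1.0, 0.3, 0.1], [0.3, 2.0, 0.4], [0.1, 0.4, 3.0]])

# Exact roots and solutions of lambda * a x = b x.
lam, vecs = np.linalg.eig(np.linalg.solve(a, b))
order = np.argsort(lam.real)
lam, vecs = lam.real[order], vecs.real[:, order]

x0 = vecs[:, 0]                     # solution belonging to the smallest root
d = np.array([1.0, -2.0, 0.5])      # arbitrary direction of error in the ratios
errs = [abs(rayleigh(a, b, x0 + eps * d) - lam[0]) for eps in (1e-3, 1e-4)]
ratio = errs[0] / errs[1]           # close to 100 when the error is second order
```

Reducing the error in the trial ratios by a factor of 10 reduces the error in $\lambda$ by a factor of about 100, and any trial vector gives a value between the extreme roots.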

4·09. Small oscillations. If the changes of the coordinates in a dynamical system from their values in a position of equilibrium are $x_1 \dots x_n$, the kinetic energy can be written $T = \tfrac{1}{2} a_{ik} \dot{x}_i \dot{x}_k$, and is a positive definite form. The change of the work function‡ can be written as $W = \tfrac{1}{2} b_{ik} x_i x_k$. Then Lagrange's equations of motion are

$$a_{ik} \ddot{x}_k = b_{ik} x_k. \tag{1}$$

Assume a solution of the form

$$x_i \propto e^{\gamma t}, \tag{2}$$

where $\gamma$ is constant. Then

$$\gamma^2 a_{ik} x_k = b_{ik} x_k, \tag{3}$$

and we have a set of equations of the form already studied, $\gamma^2$ replacing $\lambda$. It follows that all the values of $\gamma^2$ are real. If any of them is positive the system is unstable; the condition of stability is that they shall all be negative. We wish to see what this implies about $T$ and $W$. Referring back to 4·082 we see that the condition is that all the changes of sign in the sequence of determinants shall be lost as $\gamma^2$ increases from $-\infty$ to 0. Therefore
the sequence

$$\|-b_{ik}\|,\ \dots,\ \begin{vmatrix} -b_{11} & -b_{12} & -b_{13} \\ -b_{21} & -b_{22} & -b_{23} \\ -b_{31} & -b_{32} & -b_{33} \end{vmatrix},\ \begin{vmatrix} -b_{11} & -b_{12} \\ -b_{21} & -b_{22} \end{vmatrix},\ -b_{11} \tag{4}$$

are all positive; and this is the condition that $W$ is a negative definite form, or alternatively that the potential energy $V = -W$ is a positive definite form.

Evidently if we make a substitution

$$x_i = l_{ij}\xi_j \tag{5}$$

that reduces $a_{ik} x_i x_k$ to a sum of squares of the $\xi_j$ it will also reduce $a_{ik} \dot{x}_i \dot{x}_k$ to a sum of squares of the $\dot{\xi}_j$. Our general argument then shows that $T$ and $W$ can be expressed simultaneously as sums of squares (the latter with all the coefficients negative in a stable system) even if the determinantal equation has multiple roots. Then Lagrange's method gives independent differential equations for the $\xi_j$ with regard to the time, and they therefore all vary independently. They are known as the normal coordinates of the system. In practice this method of obtaining the solution is laborious, and the operational method to be described later is much easier if anything more than the free periods is required. The present method, however, explains a feature always found in the operational method when the period equation has equal roots. At present we need only give an illustration of what happens to the normal coordinates in that case. Consider a pendulum free to vibrate in any horizontal direction. If the horizontal rectangular coordinates are $x_1$ and $x_2$ we have to the second order, in the usual notation,

$$2T = m(\dot{x}_1^2 + \dot{x}_2^2), \qquad 2W = -\frac{mg}{l}(x_1^2 + x_2^2).$$

Both are already expressed as sums of squares and $x_1$, $x_2$ are normal coordinates. But we could put

$$x_1 = \xi_1 \cos\alpha - \xi_2 \sin\alpha, \qquad x_2 = \xi_1 \sin\alpha + \xi_2 \cos\alpha,$$

and then

$$2T = m(\dot{\xi}_1^2 + \dot{\xi}_2^2), \qquad 2W = -\frac{mg}{l}(\xi_1^2 + \xi_2^2),$$

so that $\xi_1$, $\xi_2$ are also normal coordinates. In fact the coincidence of the periods implies that we can choose the normal coordinates in an infinite number of ways.

† A full account is given by G. Temple and W. G. Bickley, Rayleigh's Principle, Oxford, 1933.

‡ In most dynamical problems the work function $W$ is slightly more convenient than the potential energy $V = -W$. The exceptions are when $W$ is obviously a negative definite form and therefore $V$ a positive one.


4·091. Small oscillations about steady motion. In these cases, first studied systematically by Routh, the equations of motion take the form

$$a_{ik} \ddot{x}_k + g_{ik} \dot{x}_k - b_{ik} x_k = 0, \tag{1}$$

where $a_{ik}$ and $b_{ik}$ are symmetrical but $g_{ik}$ is antisymmetrical. All are real, and $a_{ik} x_i x_k$ is positive definite. A sufficient condition for stability follows if we multiply by $\dot{x}_i$ and add; the $g_{ik}$ terms cancel, and

$$\frac{d}{dt}\left(a_{ik} \dot{x}_i \dot{x}_k - b_{ik} x_i x_k\right) = 0. \tag{2}$$

The first term inside the bracket is positive definite, and it follows that if $b_{ik} x_i x_k$ is negative definite it can never exceed numerically a value determined by the initial conditions; hence in these conditions the system is stable. For oscillations about equilibrium this condition is also necessary, but for oscillations about steady motion it is not, otherwise, for instance, a top could not stand up. The method of solution for given systems is straightforward, but some general features will be discussed.

If we assume $x_i \propto e^{\gamma t}$ as before we get the equation of consistency

$$D = \|a_{ik}\gamma^2 + g_{ik}\gamma - b_{ik}\| = 0. \tag{3}$$

If we denote the general element by $c_{ik}$ the matrix $c_{ik}$ is hermitian if $\gamma$ is purely imaginary. But if $\gamma$ is real the matrix is real and not symmetrical, so that our previous discussion needs modification. Further, the general element contains terms of three degrees 0, 1, 2 in $\gamma$, so that the theory of small oscillations about steady motion is in two respects more complicated than that of what we may now call simple hermitian matrices, where terms are of degrees 0 and 1 in $\lambda$.

Changing $\gamma$ to $-\gamma$ and then interchanging rows and columns leaves $D$ unaltered; hence the values of $\gamma$ occur in equal and opposite pairs. But if $\gamma$ is imaginary, the ratios of the $x_i$ are usually complex and the phases of the coordinates differ in any simple oscillation. No reduction to normal coordinates is possible.

For the stability condition we form the minors $D_n\,(= D)$, $D_{n-1}$, ..., $D_0$ as before. We still have

$$D_s D_{s-2} = D_{s-1} C^s_{s-1,s-1} - C^s_{s-1,s}\, C^s_{s,s-1}. \tag{4}$$

If $\gamma$ is pure imaginary

$$C_{n-1,n}\, C_{n,n-1} = |C_{n-1,n}|^2 > 0, \tag{5}$$
and again as $\gamma^2$ increases from $-\infty$ to 0 changes of sign can be lost only at the beginning. But if $\gamma^2$ is positive $C_{n-1,n}$ and $C_{n,n-1}$ are no longer conjugate complexes and their product need not be positive. The root-separation theorem therefore still holds so long as $\gamma^2$ is negative; if then $a_{ik} x_i x_k$ is a positive definite form, as it must be, and if also $b_{ik} x_i x_k$ is negative definite, it follows exactly as before that there must be $n$ negative real values of $\gamma^2$ and the system is stable. This is the case that we derived directly from the equations of motion. But it does not follow that if $b_{ik} x_i x_k$ is not negative definite the system is unstable. For here, unlike the case of oscillations about equilibrium, a change of sign can be gained at the beginning of the sequence. This did not arise before because $n$ changes of sign were lost in any case as $\gamma^2$ varied from $-\infty$ to $+\infty$, and if any had been gained at any stage they would have had to be compensated by additional losses, so that we should have had an algebraic equation of the $n$th degree with more than $n$ roots. But changes of sign can be gained in a gyroscopic system. Consequently in a gyroscopic system, with $b_{ik} x_i x_k$ not negative definite, we cannot assert instability without actually investigating the period equation in detail.

Again, if the system is unstable, $\gamma^2$ may not be real. For the root-separation theorem fails for positive $\gamma^2$, and without it changes of sign may be lost when $\gamma^2$ passes through a zero of some intermediate member of the series. Thus there may be a loss of an even number of changes of sign not accounted for by real zeros of $D$, which must therefore have complex zeros.

Since the root-separation theorem is true for $b_{ik} x_i x_k$ negative definite it still follows that if there is an $r$-fold zero of $D$ for negative $\gamma^2$ the matrix $c_{ik}$ is of rank $n-r$ for that value and all first minors of $D$ have $(r-1)$-fold zeros. But this is not true if $b_{ik} x_i x_k$ is not negative definite.


4·092. These considerations are illustrated by the upright top if we take the coordinates $x_1$, $x_2$ to be direction cosines of the axis with respect to two perpendicular horizontal axes. The equations of motion are, in a usual notation,

$$A\ddot{x}_1 + Cn\dot{x}_2 - Mgh\,x_1 = 0, \qquad A\ddot{x}_2 - Cn\dot{x}_1 - Mgh\,x_2 = 0, \tag{1}$$

so that

$$b_{ik} x_i x_k = Mgh\,(x_1^2 + x_2^2), \tag{2}$$

and is positive definite.

We put $x_1$, $x_2$ proportional to $e^{\gamma t}$; then

$$(A\gamma^2 - Mgh)\,x_1 = -Cn\gamma\, x_2, \qquad (A\gamma^2 - Mgh)\,x_2 = Cn\gamma\, x_1,$$

$$D_2 = (A\gamma^2 - Mgh)^2 + C^2 n^2 \gamma^2 = 0, \qquad A\gamma^2 \pm iCn\gamma - Mgh = 0,$$

$$\gamma = \pm\frac{iCn}{2A} \pm \frac{i}{2}\left(\frac{C^2 n^2}{A^2} - \frac{4Mgh}{A}\right)^{1/2}. \tag{3}$$

These values are all purely imaginary if $C^2 n^2 > 4AMgh$. If $C^2 n^2 < 4AMgh$ two roots have positive real parts and the top is unstable; but $\gamma^2$ is complex, whereas it would be real and positive for an unstable position of equilibrium. The motion of the end of the axis is an equiangular spiral, to the first order.

If $C^2 n^2 = 4AMgh$ the roots become equal in pairs. Taking $\gamma = -\tfrac{1}{2} iCn/A$ we find that the matrix of the coefficients is

$$2Mgh \begin{pmatrix} -1 & -i \\ i & -1 \end{pmatrix},$$

which is of rank 1. Hence there is only one solution of the form $e^{\gamma t}$ with this $\gamma$, and similarly with the opposite sign for $\gamma$. The other solutions are of the form $t e^{\gamma t}$, which do not occur at all in oscillations about equilibrium.
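The stability criterion $C^2 n^2 > 4AMgh$ can be illustrated by solving $D_2 = 0$ numerically. The sketch below assumes NumPy and purely illustrative parameter values; `top_exponents` is a hypothetical helper that expands $D_2$ as a quartic in $\gamma$.

```python
import numpy as np

def top_exponents(A, C, n, Mgh):
    """Roots gamma of D_2 = (A g^2 - Mgh)^2 + C^2 n^2 g^2 = 0."""
    # Expanded: A^2 g^4 + (C^2 n^2 - 2 A Mgh) g^2 + Mgh^2 = 0
    coeffs = [A**2, 0.0, C**2 * n**2 - 2.0 * A * Mgh, 0.0, Mgh**2]
    return np.roots(coeffs)

A, C, Mgh = 1.0, 0.5, 1.0                  # illustrative values only
n_crit = np.sqrt(4.0 * A * Mgh) / C        # spin at which C^2 n^2 = 4 A Mgh

fast = top_exponents(A, C, 1.5 * n_crit, Mgh)   # above the critical spin
slow = top_exponents(A, C, 0.5 * n_crit, Mgh)   # below the critical spin
```

For the fast top all four exponents are purely imaginary (to rounding error). For the slow top two exponents have positive real parts and $\gamma^2$ is complex with $|\gamma^2| = Mgh/A$.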

We have $D_1 = A\gamma^2 - Mgh$, $D_0 = 1$, and the signs run as follows, failing to reveal any zeros of $D_2$ for negative $\gamma^2$:

    γ²      D₂    D₁    D₀
    −∞      +     −     +
    0       +     −     +
    +∞      +     +     +

If, however, we discuss the top hanging downwards, $b_{ik} x_i x_k$ is negative definite. We need only reverse the sign of $g$. In this case there cannot be equal roots; and the signs in the last table run:

    γ²      D₂    D₁    D₀
    −∞      +     −     +
    0       +     +     +
    +∞      +     +     +

so that two changes of sign are lost between $-\infty$ and 0.

4·093. The imaginary values of $\gamma$ have a stationary property for gyroscopic systems. If we take $\gamma$ so that

$$\gamma^2 = \frac{-\gamma\, g_{ik} x_k x_i^* + b_{ik} x_k x_i^*}{a_{ik} x_k x_i^*}$$

we see that the equations 4·091 (1) are equivalent to the statement that $\gamma$ is stationary for both real and imaginary variations of the $x_i$. For any complex $x_i$, $g_{ik} x_k x_i^*$ is purely imaginary. We take it not zero. Put for any solution of the form $x_i \propto e^{\gamma t}$

$$a_{ik} x_k x_i^* = P, \qquad g_{ik} x_k x_i^* = iQ, \qquad b_{ik} x_k x_i^* = R.$$

Then

$$2P\gamma = -iQ \pm (-Q^2 + 4RP)^{1/2}.$$

If $Q^2 < 4RP$ both roots are complex, and $|\gamma^2| = R/P$. Thus if the system would be unstable in the absence of gyroscopic terms, the real part of $\gamma$ is less than it would be for the same $x_i$ if the $g_{ik}$ were all 0. But this, by Rayleigh's principle, is less than for the actual values of the $x_i$ in the mode without gyroscopic terms that gives the largest $\gamma$. Hence the gyroscopic effects tend to reduce instability.

If $Q^2 > 4RP$ both roots are imaginary and the system is stable. If $R$ is a negative definite form, the system is stable for all $Q$, and if for oscillations with $g_{ik} = 0$ there are two roots $\pm i\sigma$, the gyroscopic effects will increase the speed of vibration for one and diminish it for the other.

4·10. Roots of unitary and orthogonal matrices. Let $a_{ik}$ be a unitary matrix of $n$ rows and columns. Then

$$a_{ik} a_{im}^* = \delta_{km}. \tag{1}$$

Consider the equations

$$a_{ik} x_k = \lambda x_i. \tag{2}$$

In general the roots of the determinantal equation

$$\|a_{ik} - \lambda\delta_{ik}\| = 0 \tag{3}$$

will be complex. Then the conjugate of (2) gives

$$a_{im}^* x_m^* = \lambda^* x_i^*. \tag{4}$$

Multiply (2) and (4). Then

$$a_{ik} a_{im}^* x_k x_m^* = \lambda\lambda^* x_i x_i^*. \tag{5}$$

But on account of (1), (5) reduces to

$$x_k x_k^* = \lambda\lambda^* x_i x_i^*, \tag{6}$$

and therefore

$$\lambda\lambda^* = 1. \tag{7}$$

Hence all the roots of a unitary matrix have modulus 1.
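A quick numerical illustration of this result, offered only as a sketch: NumPy is assumed, and the unitary matrix is manufactured as the Q factor of a random complex matrix, which is a convenient construction of test data and not anything taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
u, _ = np.linalg.qr(z)              # Q factor of a complex matrix is unitary

roots, vecs = np.linalg.eig(u)
moduli = np.abs(roots)              # each should equal 1
gram = vecs.conj().T @ vecs         # x_{i1} x_{i2}^* for every pair of solutions
```

`moduli` comes out as a vector of ones, and `gram` is the unit matrix to rounding error when the roots are distinct, which is the complex orthogonality property of the solutions.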

Now let $\lambda_1$ and $\lambda_2$ be two different roots of (3) and $x_{i1}$, $x_{i2}$ corresponding values of the $x_i$. Then proceeding as before we get

$$a_{ik} a_{im}^* x_{k1} x_{m2}^* = \lambda_1 \lambda_2^* x_{i1} x_{i2}^*, \tag{8}$$

and also

$$= x_{k1} x_{k2}^*. \tag{9}$$

Hence either

$$\lambda_1 \lambda_2^* = 1 \tag{10}$$

or

$$x_{i1} x_{i2}^* = 0. \tag{11}$$

In the former case $\lambda_1 = \lambda_2$ (since $\lambda_2\lambda_2^* = 1$), which we have already considered. Hence the solutions have a property of complex orthogonality similar to that possessed by a set of hermitian equations.

A unitary matrix $a$ can always be transformed to diagonal form by a transformation $l^{-1} a l$ where $l$ is unitary.† This holds in particular for a real unitary matrix; but $l$ in general remains complex.

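† D. E. Littlewood, The Theory of Group Characters, 1940, 16. See also Ex. 10, p. 169.

4·101. Consider now the most general $2 \times 2$ unitary matrix. As in 4·032 we write

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}. \tag{1}$$

Then

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a^* & c^* \\ b^* & d^* \end{pmatrix} = \begin{pmatrix} aa^* + bb^* & ac^* + bd^* \\ ca^* + db^* & cc^* + dd^* \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{2}$$

and we can take

$$a = \cos\alpha\, e^{i\beta}, \quad b = -\sin\alpha\, e^{i\gamma}, \qquad a^* = \cos\alpha\, e^{-i\beta}, \quad b^* = -\sin\alpha\, e^{-i\gamma}, \tag{3}$$

with $\alpha$, $\beta$, $\gamma$ real. Then

$$\frac{c}{d} = -\frac{b^*}{a^*} = \tan\alpha\, e^{i(\beta-\gamma)}, \qquad \frac{c^*}{d^*} = -\frac{b}{a} = \tan\alpha\, e^{-i(\beta-\gamma)}, \tag{4}$$

and these are equivalent. Also

$$\frac{cc^*}{dd^*} = \tan^2\alpha, \qquad dd^* = \cos^2\alpha, \qquad cc^* = \sin^2\alpha, \tag{5}$$

and all the conditions are satisfied if

$$d = \cos\alpha\, e^{i\delta}, \qquad c = \sin\alpha\, e^{i(\beta-\gamma+\delta)}. \tag{6}$$

Take

$$\beta + \delta = 2\epsilon, \qquad \beta - \delta = 2\eta. \tag{7}$$

Then

$$\begin{pmatrix} \cos\alpha\, e^{i(\epsilon+\eta)} & -\sin\alpha\, e^{i\gamma} \\ \sin\alpha\, e^{i(2\epsilon-\gamma)} & \cos\alpha\, e^{i(\epsilon-\eta)} \end{pmatrix} = e^{i\epsilon} \begin{pmatrix} \cos\alpha\, e^{i\eta} & -\sin\alpha\, e^{i(\gamma-\epsilon)} \\ \sin\alpha\, e^{i(\epsilon-\gamma)} & \cos\alpha\, e^{-i\eta} \end{pmatrix}$$

$$= e^{i\epsilon} \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix} \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} e^{i\phi} & 0 \\ 0 & e^{-i\phi} \end{pmatrix}, \tag{8}$$

where

$$\theta = \tfrac{1}{2}(\eta + \gamma - \epsilon), \qquad \phi = \tfrac{1}{2}(\eta - \gamma + \epsilon). \tag{9}$$

Thus whereas the most general real orthogonal $2 \times 2$ matrix can be expressed in terms of one angle, the most general unitary one needs four.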
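The four-angle factorization can be checked directly. The following sketch assumes NumPy; `unitary_2x2` is a hypothetical helper that multiplies out a phase factor, a diagonal phase matrix, a real rotation and a second diagonal phase matrix, with arbitrary angle values chosen for illustration.

```python
import numpy as np

def unitary_2x2(alpha, eps, theta, phi):
    """e^{i eps} * diag(e^{i theta}, e^{-i theta}) * rotation(alpha) * diag(e^{i phi}, e^{-i phi})."""
    left = np.diag([np.exp(1j * theta), np.exp(-1j * theta)])
    mid = np.array([[np.cos(alpha), -np.sin(alpha)],
                    [np.sin(alpha),  np.cos(alpha)]])
    right = np.diag([np.exp(1j * phi), np.exp(-1j * phi)])
    return np.exp(1j * eps) * (left @ mid @ right)

u = unitary_2x2(alpha=0.7, eps=0.3, theta=-1.1, phi=0.4)
identity_check = u @ u.conj().T      # should be the 2x2 unit matrix
det_u = np.linalg.det(u)             # equals e^{2 i eps}, of modulus 1
```

Any choice of the four real angles gives a unitary matrix; the common phase $\epsilon$ is the only part that affects the determinant.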


4·102. General rotation in three dimensions in terms of a $2 \times 2$ unitary transformation. Let $\mathbf{A}_1 = (A_1, B_1, C_1)$, $\mathbf{A}_2 = (A_2, B_2, C_2)$ be two real vectors, equal in magnitude and perpendicular, that is,

$$A_1^2 + B_1^2 + C_1^2 = A_2^2 + B_2^2 + C_2^2, \qquad A_1 A_2 + B_1 B_2 + C_1 C_2 = 0. \tag{1}$$

Four of the real quantities $A_1$, $B_1$, $C_1$, $A_2$, $B_2$, $C_2$ can evidently be assigned independently. Under an orthogonal transformation both vectors preserve their magnitudes and remain perpendicular; and if under a transformation this is true for all pairs of equal perpendicular real vectors the transformation is orthogonal.

Form the 'complex vector' $(a, b, c)$ such that

$$a = A_1 + iA_2, \qquad b = B_1 + iB_2, \qquad c = C_1 + iC_2. \tag{2}$$

Then by (1)

$$a^2 + b^2 + c^2 = 0, \tag{3}$$

whence $(ib - a)(ib + a) = c^2$.

We may call $(a, b, c)$ a complex null vector; and $x_1$, $x_2$ exist such that

$$a = \tfrac{1}{2}(x_1^2 - x_2^2), \qquad b = \tfrac{1}{2} i\,(x_1^2 + x_2^2), \qquad c = -x_1 x_2. \tag{4}$$

Then $x_1$, $x_2$ are determined except for sign if $(a, b, c)$ are given; and conversely for any assigned (complex) $x_1$, $x_2$, (4) will give $(a, b, c)$ satisfying (3), and (1) follow on equating real and imaginary parts. Also

$$|a^2| + |b^2| + |c^2| = \tfrac{1}{2}(x_1 x_1^* + x_2 x_2^*)^2,$$

and also

$$= (A_1^2 + B_1^2 + C_1^2) + (A_2^2 + B_2^2 + C_2^2). \tag{5}$$

If $x_1$, $x_2$ undergo a linear transformation to $y_1$, $y_2$ such that

$$x_1 x_1^* + x_2 x_2^* = y_1 y_1^* + y_2 y_2^*, \tag{6}$$

$a'$, $b'$, $c'$ defined by replacing $x_1$, $x_2$ by $y_1$, $y_2$ in (4) will be linear in $x_1^2$, $x_2^2$, $x_1 x_2$ and therefore in $(a, b, c)$; and

$$|a'^2| + |b'^2| + |c'^2| = |a^2| + |b^2| + |c^2|. \tag{7}$$

It follows that the vectors $\mathbf{A}_1'$, $\mathbf{A}_2'$ derived from $a'$, $b'$, $c'$ are equal in magnitude to $\mathbf{A}_1$ and $\mathbf{A}_2$ and mutually perpendicular. But in general $\mathbf{A}_1'$ will depend on both $\mathbf{A}_1$ and $\mathbf{A}_2$. We want a real transformation that will represent a general rotation of $\mathbf{A}_1$ into $\mathbf{A}_1'$, and $\mathbf{A}_2$ into $\mathbf{A}_2'$, where $\mathbf{A}_1'$ is independent of $\mathbf{A}_2$ and $\mathbf{A}_2'$ of $\mathbf{A}_1$. This requires a further restriction, namely that the real and imaginary parts of $a$, $b$, $c$ transform separately. We proceed to find in what conditions the transformation given by (6) will have this property.
Now the most general unitary transformation of two variables can be put in the form

$$y_1 = (\alpha x_1 + \beta x_2)\, e^{i\epsilon}, \qquad y_2 = (-\beta^* x_1 + \alpha^* x_2)\, e^{i\epsilon}, \tag{8}$$

where $\epsilon$ is real and $\alpha\alpha^* + \beta\beta^* = 1$. Then if

$$a' = \tfrac{1}{2}(y_1^2 - y_2^2), \qquad b' = \tfrac{1}{2} i\,(y_1^2 + y_2^2), \qquad c' = -y_1 y_2, \tag{9}$$

we find that

$$\begin{aligned}
a' e^{-2i\epsilon} &= \tfrac{1}{2}(\alpha^2 + \alpha^{*2} - \beta^2 - \beta^{*2})\,a + \tfrac{1}{2i}(\alpha^2 - \alpha^{*2} + \beta^2 - \beta^{*2})\,b - (\alpha\beta + \alpha^*\beta^*)\,c, \\
b' e^{-2i\epsilon} &= \tfrac{1}{2i}(-\alpha^2 + \alpha^{*2} + \beta^2 - \beta^{*2})\,a + \tfrac{1}{2}(\alpha^2 + \alpha^{*2} + \beta^2 + \beta^{*2})\,b + \tfrac{1}{i}(\alpha\beta - \alpha^*\beta^*)\,c, \\
c' e^{-2i\epsilon} &= (\alpha\beta^* + \alpha^*\beta)\,a + \tfrac{1}{i}(\alpha\beta^* - \alpha^*\beta)\,b + (\alpha\alpha^* - \beta\beta^*)\,c.
\end{aligned} \tag{10}$$

The coefficients on the right are all real. Hence the condition that the real and imaginary parts of $a$, $b$, $c$ shall transform separately is that $e^{-2i\epsilon}$ is real and therefore $\pm 1$. In either case the transformation is orthogonal. If $\alpha = 1$, $\beta = 0$, it is seen at once that the determinant of the coefficients on the right is 1. For any $\alpha$, $\beta$ we can take $\epsilon = 0$, and the transformation, being orthogonal, must give determinant $\pm 1$; and as it is a continuous function of $\alpha$ and $\beta$ it must therefore always be 1. Hence if $e^{-2i\epsilon} = 1$ the transformation is a rotation. If $e^{-2i\epsilon} = -1$ it is a reflexion. (With a lavish expenditure of paper the determinant can be shown to be identically $(\alpha\alpha^* + \beta\beta^*)^3$.) Hence we can represent the most general rotation by taking $\epsilon = 0$ in (8) and (10).

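In particular, take

$$\begin{pmatrix} \alpha & \beta \\ -\beta^* & \alpha^* \end{pmatrix} = \begin{pmatrix} \cos\tfrac{1}{2}\theta & -\sin\tfrac{1}{2}\theta \\ \sin\tfrac{1}{2}\theta & \cos\tfrac{1}{2}\theta \end{pmatrix}. \tag{11}$$

Then

$$A_1' = A_1\cos\theta + C_1\sin\theta, \qquad B_1' = B_1, \qquad C_1' = -A_1\sin\theta + C_1\cos\theta. \tag{12}$$

This represents a right-handed rotation through $\theta$ about the $y$ axis.

If

$$\alpha = e^{-\frac{1}{2} i\psi}, \qquad \beta = 0, \tag{13}$$

we have

$$A_1' = A_1\cos\psi - B_1\sin\psi, \qquad B_1' = A_1\sin\psi + B_1\cos\psi, \qquad C_1' = C_1, \tag{14}$$

which represents a right-handed rotation through $\psi$ about the $z$ axis.

If

$$\alpha = \cos\tfrac{1}{2}\phi, \qquad \beta = -i\sin\tfrac{1}{2}\phi, \tag{15}$$

$$A_1' = A_1, \qquad B_1' = B_1\cos\phi - C_1\sin\phi, \qquad C_1' = B_1\sin\phi + C_1\cos\phi, \tag{16}$$

which represents a right-handed rotation through $\phi$ about the $x$ axis.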


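The most general unimodular unitary transformation can be built up from operations of this type. The transformation given by the matrix

$$\begin{pmatrix} \alpha & \beta \\ -\beta^* & \alpha^* \end{pmatrix} = \begin{pmatrix} e^{-\frac{1}{2} i\lambda} & 0 \\ 0 & e^{\frac{1}{2} i\lambda} \end{pmatrix} \begin{pmatrix} \cos\tfrac{1}{2}\theta & -\sin\tfrac{1}{2}\theta \\ \sin\tfrac{1}{2}\theta & \cos\tfrac{1}{2}\theta \end{pmatrix} \begin{pmatrix} e^{-\frac{1}{2} i\chi} & 0 \\ 0 & e^{\frac{1}{2} i\chi} \end{pmatrix}$$

$$= \begin{pmatrix} \cos\tfrac{1}{2}\theta\, e^{-\frac{1}{2} i(\lambda+\chi)} & -\sin\tfrac{1}{2}\theta\, e^{-\frac{1}{2} i(\lambda-\chi)} \\ \sin\tfrac{1}{2}\theta\, e^{\frac{1}{2} i(\lambda-\chi)} & \cos\tfrac{1}{2}\theta\, e^{\frac{1}{2} i(\lambda+\chi)} \end{pmatrix}, \tag{17}$$

leads to the general rotation in the 3-space given in 4·034. To each such rotation there correspond two $2 \times 2$ transformations.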
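The correspondence between unimodular unitary $2 \times 2$ matrices and rotations can be tried out numerically. The sketch below is illustrative only: it assumes NumPy, takes $\epsilon = 0$, and uses the null-vector convention $a = \tfrac12(x_1^2 - x_2^2)$, $b = \tfrac12 i(x_1^2 + x_2^2)$, $c = -x_1x_2$; the function names are hypothetical.

```python
import numpy as np

def null_vector(x1, x2):
    """Complex null vector (a, b, c) built from the pair (x1, x2)."""
    return np.array([0.5 * (x1**2 - x2**2), 0.5j * (x1**2 + x2**2), -x1 * x2])

def rotation_from_unitary(alpha, beta):
    """Real 3x3 matrix of coefficients for epsilon = 0, with |alpha|^2 + |beta|^2 = 1."""
    a, b = alpha, beta
    ac, bc = np.conj(a), np.conj(b)
    m = np.array([
        [0.5 * (a*a + ac*ac - b*b - bc*bc), (a*a - ac*ac + b*b - bc*bc) / 2j, -(a*b + ac*bc)],
        [(-a*a + ac*ac + b*b - bc*bc) / 2j, 0.5 * (a*a + ac*ac + b*b + bc*bc), (a*b - ac*bc) / 1j],
        [a*bc + ac*b, (a*bc - ac*b) / 1j, a*ac - b*bc],
    ])
    return m.real   # imaginary parts vanish identically

# A general unimodular unitary pair (alpha, beta), chosen arbitrarily.
al, be = 0.6 * np.exp(0.4j), 0.8 * np.exp(-1.3j)
R = rotation_from_unitary(al, be)

# Transform an arbitrary pair (x1, x2) and compare the two null vectors.
x1, x2 = 0.3 + 0.7j, -1.1 + 0.2j
y1, y2 = al * x1 + be * x2, -np.conj(be) * x1 + np.conj(al) * x2
lhs, rhs = null_vector(y1, y2), R @ null_vector(x1, x2)
```

Here `lhs` and `rhs` agree, `R` is orthogonal with determinant 1, and replacing `(al, be)` by `(-al, -be)` gives the same `R`, which is the two-to-one correspondence noted above.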


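4·11. The Pauli spin matrices. In the $x$, $y$, $z$ space small rotations $\epsilon$ about the axes $x$, $y$, $z$ respectively are given by the three matrices

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -\epsilon \\ 0 & \epsilon & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 & \epsilon \\ 0 & 1 & 0 \\ -\epsilon & 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & -\epsilon & 0 \\ \epsilon & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{18}$$

where powers of $\epsilon$ above the first are neglected. Corresponding $2 \times 2$ unitary transformations are given by the matrices

$$\begin{pmatrix} 1 & -\tfrac{1}{2} i\epsilon \\ -\tfrac{1}{2} i\epsilon & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & -\tfrac{1}{2}\epsilon \\ \tfrac{1}{2}\epsilon & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 - \tfrac{1}{2} i\epsilon & 0 \\ 0 & 1 + \tfrac{1}{2} i\epsilon \end{pmatrix}. \tag{19}$$

If we write

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{20}$$

the matrices (19) are $1 - \tfrac{1}{2} i\epsilon\sigma_1$, $1 - \tfrac{1}{2} i\epsilon\sigma_2$, $1 - \tfrac{1}{2} i\epsilon\sigma_3$.

The matrices (20) are the 'Pauli matrices', which occur in the theory of electron spin in quantum mechanics.

We can verify that

$$\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = 1, \qquad \sigma_1\sigma_2 = -\sigma_2\sigma_1 = i\sigma_3, \qquad \sigma_2\sigma_3 = -\sigma_3\sigma_2 = i\sigma_1, \qquad \sigma_3\sigma_1 = -\sigma_1\sigma_3 = i\sigma_2.$$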
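These product rules are easy to confirm by direct multiplication. A minimal sketch, assuming NumPy and the standard form of the three matrices; the helper `small_rotation` is a hypothetical name for the first-order unitary matrix $1 - \tfrac12 i\epsilon\sigma$.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def small_rotation(sigma, eps):
    """2x2 matrix 1 - (1/2) i eps sigma, unitary to first order in eps."""
    return I2 - 0.5j * eps * sigma

# Cyclic products: each should equal i times the third matrix.
products = {"12": s1 @ s2, "23": s2 @ s3, "31": s3 @ s1}
# Anticommutators: each should vanish.
anticommutators = [s1 @ s2 + s2 @ s1, s2 @ s3 + s3 @ s2, s3 @ s1 + s1 @ s3]
```

For example, `small_rotation(s2, eps)` has off-diagonal elements $\mp\tfrac12\epsilon$, the small-angle form of the real rotation matrix through $\tfrac12\epsilon$.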


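The matrices $\sigma_1$, $\sigma_2$, $\sigma_3$ therefore anticommute.

4·12. The Eddington and Dirac $4 \times 4$ matrices. Consider the $4 \times 4$ matrices

$$E_1 = \begin{pmatrix} i\sigma_1 & 0 \\ 0 & i\sigma_1 \end{pmatrix}, \qquad E_2 = \begin{pmatrix} i\sigma_3 & 0 \\ 0 & i\sigma_3 \end{pmatrix}, \qquad E_3 = \begin{pmatrix} 0 & -\sigma_2 \\ \sigma_2 & 0 \end{pmatrix}, \qquad E_4 = \begin{pmatrix} -i\sigma_2 & 0 \\ 0 & i\sigma_2 \end{pmatrix},$$

where, for example,

$$E_1 = \begin{pmatrix} i\sigma_1 & 0 \\ 0 & i\sigma_1 \end{pmatrix} = \begin{pmatrix} 0 & i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & 0 & i & 0 \end{pmatrix},$$

the elements $\sigma$ being written out as indicated and the inner enclosing brackets then removed. By 4·063, multiplication may be carried out on the $2 \times 2$ matrices of which the elements are themselves $2 \times 2$ matrices, and the results then expanded into $4 \times 4$ matrices. Then

$$E_1^2 = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} = -1,$$

and similarly the squares of $E_2$, $E_3$, $E_4$ are $-1$. Also the four matrices are easily shown to anticommute. If we now write

$$iE_5 = E_1 E_2 E_3 E_4,$$

we have

$$E_5^2 = -E_1E_2E_3E_4E_1E_2E_3E_4 = E_1E_1E_2E_3E_4E_2E_3E_4 = -E_2E_3E_4E_2E_3E_4 = -E_2E_2E_3E_4E_3E_4 = E_3E_4E_3E_4 = -E_3E_3E_4E_4 = -1,$$

and

$$iE_1E_5 = -E_2E_3E_4, \qquad iE_5E_1 = E_1E_2E_3E_4E_1 = -E_1E_1E_2E_3E_4 = E_2E_3E_4.$$

Thus $E_5$ anticommutes with $E_1$, and similarly with $E_2$, $E_3$, and $E_4$.

Thus we have an anticommuting pentad of square roots of $-1$. Written in full they are as follows, in order:

$$\begin{pmatrix} 0 & i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & 0 & i & 0 \end{pmatrix}, \quad \begin{pmatrix} i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \\ 0 & 0 & i & 0 \\ 0 & 0 & 0 & -i \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 & 0 & i \\ 0 & 0 & -i & 0 \\ 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$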
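The pentad can be assembled and checked mechanically. In this sketch, which assumes NumPy, the block structure is built with `np.kron` (outer $2 \times 2$ pattern first, inner $2 \times 2$ element second); the dictionary `E` and the helper `anticommute` are illustrative names, and the block forms used for $E_1$ to $E_4$ are those read off from the matrices written in full above.

```python
import numpy as np
from itertools import combinations

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
I4 = np.eye(4, dtype=complex)

# np.kron(outer, inner) places inner * outer[i, j] in block (i, j).
E = {
    1: np.kron(I2, 1j * s1),
    2: np.kron(I2, 1j * s3),
    3: np.kron(-1j * s2, s2),
    4: np.kron(-s3, 1j * s2),
}
E[5] = -1j * E[1] @ E[2] @ E[3] @ E[4]     # from i E_5 = E_1 E_2 E_3 E_4

def anticommute(p, q):
    return np.allclose(p @ q + q @ p, 0)

pentad_ok = (all(np.allclose(E[m] @ E[m], -I4) for m in E)
             and all(anticommute(E[m], E[n]) for m, n in combinations(E, 2)))

# The ten products E_mu E_nu (mu < nu) are further square roots of -1.
products = {(m, n): E[m] @ E[n] for m, n in combinations(sorted(E), 2)}
products_ok = all(np.allclose(p @ p, -I4) for p in products.values())
```

Both flags come out true, and `E[5]` reproduces the fifth matrix of the list.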


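Any matrix $E_\mu E_\nu$ ($\mu \neq \nu$) has square $-1$. Denoting it by $E_{\mu\nu}$ we find

$$E_{12} = \begin{pmatrix} i\sigma_2 & 0 \\ 0 & i\sigma_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{pmatrix}, \qquad E_{13} = \begin{pmatrix} 0 & \sigma_3 \\ -\sigma_3 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix},$$

$$E_{14} = \begin{pmatrix} i\sigma_3 & 0 \\ 0 & -i\sigma_3 \end{pmatrix} = \begin{pmatrix} i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & 0 & 0 & i \end{pmatrix}, \qquad E_{15} = \begin{pmatrix} 0 & i\sigma_3 \\ i\sigma_3 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & i & 0 \\ 0 & 0 & 0 & -i \\ i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \end{pmatrix},$$

$$E_{23} = \begin{pmatrix} 0 & -\sigma_1 \\ \sigma_1 & 0 \end{pmatrix}, \qquad E_{24} = \begin{pmatrix} -i\sigma_1 & 0 \\ 0 & i\sigma_1 \end{pmatrix}, \qquad E_{25} = \begin{pmatrix} 0 & -i\sigma_1 \\ -i\sigma_1 & 0 \end{pmatrix},$$

$$E_{34} = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \qquad E_{35} = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad E_{45} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},$$

where in the last three the elements shown are multiples of the $2 \times 2$ unit matrix.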

We thus have fifteen matrix square roots of $-1$, to which we can add a sixteenth, $E_{16} = i1$. It can be shown that the first fifteen form six anticommuting pentads, each $E$ being a member of two pentads.* All are antihermitian and unitary.

The matrices introduced by Dirac in the solution of the relativistic wave equation are 
anticommuting square roots of 1 and are connected with a pentad of Eddington’s by the 

& 1 “ ^ 25 > = ~ ^ 15 ) P “ ^ 35 > Ct\C^%CL^P = %E^. 


4·13. Oblique axes. So far the axes of reference that we have considered have been rectangular. The position of a point could, however, be expressed either by the orthogonal projections on the axes of its displacement from the origin, or by displacements in any three non-coplanar directions that will add up vectorially to the displacement from the origin to the point. The second method corresponds to the usual resolution into oblique components; we shall see that the first also has a physical significance. The characteristic feature of rectangular axes is that the two sets of quantities are then identical. Let us see what happens with oblique axes. Suppose that we have a set of rectangular axes $x_i$ and take three oblique axes $x'^j$ with direction cosines $l_{ij}$ with respect to the rectangular ones. For a reason that we shall see in a moment we denote the oblique coordinates by an index instead of a suffix, thus, $x'^j$. Then the rectangular coordinates of a point $P$ will be $l_{ij} x'^j$ as before. Since $l_{ij}$ for fixed $j$ are the direction cosines of a single line,



$$\sum_i l_{ij}^2 = 1 \quad (j \text{ fixed}), \tag{1}$$

but in general for $j \neq l$

$$l_{ij} l_{il} \neq 0, \tag{2}$$

since this is the cosine of the angle between the directions of $x'^j$ and $x'^l$, which are not perpendicular. The distance of a point from the origin is $r$, where

$$r^2 = x_i^2 = (l_{ij} x'^j)(l_{il} x'^l) = l_{ij} l_{il}\, x'^j x'^l, \tag{3}$$

* Eddington, Relativity Theory of Protons and Electrons, 1936. Eddington does not write the matrices out in full, but they are implicit in equations 3·61, p. 42. The antisymmetrical ones appear with reversed sign; this is due to the fact that in an element $a_{ik}$ he takes $i$ to refer to the column and $k$ to the row, contrary to the usual convention. In Fundamental Theory, 1946, p. 142, he gives the matrices in detail with the present convention; his $E_{31}$ is $-E_{13}$.

which is a quadratic form, and can be written

$$r^2 = g'_{jl}\, x'^j x'^l, \tag{4}$$

where $g'_{11} = g'_{22} = g'_{33} = 1$, but $g'_{23}$, $g'_{31}$, $g'_{12}$ are not 0. If the axes $x'^j$ are rectangular, but not otherwise, $g'_{jl} = \delta_{jl}$.

Now consider the orthogonal projection of $OP$ on the direction of $x'^j$ and denote it by $x'_j$ with a suffix. It is

$$x'_j = l_{ij} x_i = l_{ij} l_{il}\, x'^l = g'_{jl}\, x'^l, \tag{5}$$

and

$$x'_j x'^j = g'_{jl}\, x'^j x'^l = r^2. \tag{6}$$

Thus the form $x_i x_i$ correct for rectangular axes needs to be modified for oblique ones by raising one of the suffixes. We cannot now write it as $x'^2_j$.

The position of $P$ can be specified equally well by the values of either $x'_j$ or $x'^j$; but the
actual magnitudes of the three quantities are different. The former are called the covariant
components and the latter the contravariant components of the displacement $OP$. It is
commonly thought anomalous that the adjectives are not interchanged! But the covariant
component in the direction $Oj$ has the property that for a given position of $P$ it is
independent of the directions of the other two axes; this is not true of the contravariant
component, and in this respect the covariant components might appear to be the more 
fundamental. On the other hand, if we vary one contravariant component without 
altering the other two, we know directly how much and in what direction the point is 
displaced. This is not obvious for variation of one covariant component. Both systems 
therefore have their advantages and we need a means of transforming from either to 
the other. 

We write

$$\|g'_{jl}\| = G', \tag{7}$$

denote the cofactor of $g'_{jl}$ in $G'$ by $G'^{jl}$, and put

$$g'^{jl} = G'^{jl}/G'. \tag{8}$$

Then $g'^{jl}$ is the reciprocal matrix of $g'_{jl}$, and

$$x'^j = g'^{jl}x'_l. \tag{9}$$

This formula with its companion

$$x'_j = g'_{jl}x'^l \tag{10}$$

are known as the formulae for raising or lowering indices. The determinant $G'$ is seen to
be fundamental. If we denote the angle between the $x'^1$ and $x'^2$ axes by $\alpha_3$, and so on,

$$g'_{12} = \cos\alpha_3, \tag{11}$$

$$G' = \begin{vmatrix} 1 & \cos\alpha_3 & \cos\alpha_2 \\ \cos\alpha_3 & 1 & \cos\alpha_1 \\ \cos\alpha_2 & \cos\alpha_1 & 1 \end{vmatrix} = 1 - \cos^2\alpha_1 - \cos^2\alpha_2 - \cos^2\alpha_3 + 2\cos\alpha_1\cos\alpha_2\cos\alpha_3. \tag{12}$$

But $g'^{11}$, $g'^{22}$, $g'^{33}$ are not in general equal to 1 or to one another.

Alternatively

$$G' = \|l_{ij}l_{il}\| = \|l_{ij}\|\,\|l_{il}\| = \|l_{ij}\|^2. \tag{13}$$

But $\|l_{ij}\|$ is the volume of the parallelepiped with edges of unit length along the axes,
or alternatively the continued scalar product of direction vectors along the axes. It could
therefore be zero only if the axes were coplanar. The element of volume is therefore

$$\sqrt{G'}\,dx'^1\,dx'^2\,dx'^3.$$

If $A$ is a general vector, the resultant of components $A'^1$, $A'^2$, $A'^3$ along the oblique axes,
its rectangular components will be

$$A_i = l_{ij}A'^j, \tag{14}$$

and its covariant components will be

$$A'_j = g'_{jl}A'^l = l_{ij}A_i. \tag{15}$$

Now let $\phi$ be a scalar and consider its derivatives with regard to $x'^j$. We have

$$\frac{\partial\phi}{\partial x'^j} = \frac{\partial\phi}{\partial x_i}\frac{\partial x_i}{\partial x'^j} = l_{ij}\frac{\partial\phi}{\partial x_i}, \tag{16}$$

and $\partial\phi/\partial x'^j$ are therefore the covariant components of grad $\phi$. To get the contravariant
components we must multiply by $g'^{jl}$ and contract.

If $A$ and $B$ are two vectors,

$$A'_jB'^j = l_{ij}A_iB'^j = A_iB_i, \tag{17}$$

which is their scalar product. Similarly,

$$\frac{\partial A'^j}{\partial x'^j} = l_{ij}\frac{\partial A'^j}{\partial x_i} = \frac{\partial A_i}{\partial x_i}, \tag{18}$$

which is the divergence of $A$.


As for rectangular axes we can define tensors of any order, but the transformation rules 
will be different according as each index is upper or lower. We can contract with respect 
to a repeated index provided that it is upper in one occurrence and lower in the other, 
and the result will be a tensor of order lower by 2. 
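
The relations of this section are easily checked numerically. The following is a minimal sketch in Python, not part of the original treatment; the three axis directions and the components $(1, 2, 3)$ are arbitrary illustrative values, and the helper names are our own. It forms $g'_{jl}$ from the direction cosines, lowers and raises an index, and confirms that $x'_jx'^j = x_ix_i$ and that $G'$ agrees with the expansion in cosines.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def reciprocal3(m):
    """Reciprocal matrix: element (j, l) is cofactor of (l, j) divided by the determinant."""
    d = det3(m)
    inv = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        for l in range(3):
            rows = [r for r in range(3) if r != l]
            cols = [c for c in range(3) if c != j]
            minor = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                     - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
            inv[j][l] = (-1) ** (j + l) * minor / d
    return inv

# Unit vectors along the oblique axes (assumed values): axes[j][i] is l_ij.
axes = [
    (1.0, 0.0, 0.0),
    (0.5, math.sqrt(3.0) / 2.0, 0.0),
    (0.5, 0.5, 1.0 / math.sqrt(2.0)),
]

# g'_jl = l_ij l_il, the cosines of the angles between the axes.
g = [[dot(axes[j], axes[l]) for l in range(3)] for j in range(3)]
g_up = reciprocal3(g)

x_contra = [1.0, 2.0, 3.0]                                   # x'^j
x_rect = [sum(axes[j][i] * x_contra[j] for j in range(3)) for i in range(3)]
x_co = [dot(axes[j], x_rect) for j in range(3)]              # orthogonal projections x'_j
x_lowered = [sum(g[j][l] * x_contra[l] for l in range(3)) for j in range(3)]
x_raised = [sum(g_up[j][l] * x_co[l] for l in range(3)) for j in range(3)]

r2_rect = dot(x_rect, x_rect)      # x_i x_i
r2_mixed = dot(x_co, x_contra)     # x'_j x'^j

c1, c2, c3 = g[1][2], g[0][2], g[0][1]
G_formula = 1 - c1 * c1 - c2 * c2 - c3 * c3 + 2 * c1 * c2 * c3
```

Here `x_lowered` reproduces the projections `x_co`, and `x_raised` recovers the original contravariant components.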

4·131. Crystal structure: the reciprocal lattice. A simple three-dimensional
lattice is specified if three fundamental displacement vectors are given. Taking any atom
as origin, there will be a similar atom at any point $n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3$, where $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$
are the displacements to three neighbouring atoms not in the same plane as the origin,
and $n_1$, $n_2$, $n_3$ are integers, positive or negative. The points so specified are called lattice
points. A plane through any three lattice points will include similar atoms in a repeating
pattern. Its direction can be specified by its intercepts on axes through the origin in the
directions of $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$; let these, divided by a suitable integer, be $\mathbf{a}_1/h_1$, $\mathbf{a}_2/h_2$, $\mathbf{a}_3/h_3$, where
$h_1$, $h_2$, $h_3$ are integers with no common factor. Then $h_1$, $h_2$, $h_3$ are called the indices (Miller
indices) of the plane,* and are the same for all parallel planes.

We shall denote the volume $\mathbf{a}_1\cdot\mathbf{a}_2\wedge\mathbf{a}_3$ of a single cell of the lattice by $v_a$.

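The set of vectors $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ reciprocal to the set $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$ is defined as

$$\mathbf{b}_1 = \frac{\mathbf{a}_2\wedge\mathbf{a}_3}{v_a}, \qquad \mathbf{b}_2 = \frac{\mathbf{a}_3\wedge\mathbf{a}_1}{v_a}, \qquad \mathbf{b}_3 = \frac{\mathbf{a}_1\wedge\mathbf{a}_2}{v_a}. \tag{1}$$

They satisfy the relations

$$\mathbf{a}_i\cdot\mathbf{b}_k = \delta_{ik}, \tag{2}$$

and

$$v_b = \mathbf{b}_1\cdot\mathbf{b}_2\wedge\mathbf{b}_3 = \frac{\mathbf{b}_1\cdot\{(\mathbf{a}_3\wedge\mathbf{a}_1\cdot\mathbf{a}_2)\,\mathbf{a}_1 - (\mathbf{a}_3\wedge\mathbf{a}_1\cdot\mathbf{a}_1)\,\mathbf{a}_2\}}{v_a^2} = \frac{1}{v_a}. \tag{3}$$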

* The suggestion for this specification was made by Grassmann and others, but first became
popular through Miller's Lehrbuch der Kristallographie (1863).
Any vector $\mathbf{A}$ can be expressed by

$$\mathbf{A} = (\mathbf{A}\cdot\mathbf{b}_1)\mathbf{a}_1 + (\mathbf{A}\cdot\mathbf{b}_2)\mathbf{a}_2 + (\mathbf{A}\cdot\mathbf{b}_3)\mathbf{a}_3 \tag{4}$$

$$\phantom{\mathbf{A}} = (\mathbf{A}\cdot\mathbf{a}_1)\mathbf{b}_1 + (\mathbf{A}\cdot\mathbf{a}_2)\mathbf{b}_2 + (\mathbf{A}\cdot\mathbf{a}_3)\mathbf{b}_3. \tag{5}$$

If, therefore, we build up a lattice with $\mathbf{b}_1$, $\mathbf{b}_2$, $\mathbf{b}_3$ as fundamental vectors, the volume
of a cell is the reciprocal of that in the original lattice.

Since $\mathbf{a}_s/h_s$ are points in a lattice plane, the directions of the vectors

$$\frac{\mathbf{a}_1}{h_1} - \frac{\mathbf{a}_2}{h_2}, \qquad \frac{\mathbf{a}_1}{h_1} - \frac{\mathbf{a}_3}{h_3}$$

are parallel to this plane; but both these directions are perpendicular to $h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3$,
and this displacement in the reciprocal lattice is normal to the planes of the atomic
lattice with Miller indices $h_1$, $h_2$, $h_3$. The equation of any of these planes can therefore
be written

$$\frac{(h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3)\cdot\mathbf{x}}{|h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3|} = p, \tag{6}$$

and $p$ is the perpendicular from the origin on to it. Now if $\mathbf{x}$ is a lattice point it is of the
form $n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3$, and then

$$(h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3)\cdot(n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3) = h_1n_1 + h_2n_2 + h_3n_3. \tag{7}$$

Now if $h_1$, $h_2$, $h_3$ have no common factor we can choose $n_1$, $n_2$, $n_3$ so that the sum on the
right is 1. For, first, if $h_1$, $h_2$ have highest common factor $q$, the process of finding the highest
common factor enables us to determine $s_1$, $s_2$ so that $s_1h_1 + s_2h_2 = q$. Similarly, if $h_2$, $h_3$
have highest common factor $r$ we can find $t_1$, $t_2$ so that $t_1h_2 + t_2h_3 = r$. But, by hypothesis, $q$ and $r$
have no common factor; therefore we can find a linear combination of these expressions,
with integral coefficients, that is equal to 1. This can be taken as $h_1n_1 + h_2n_2 + h_3n_3$.
Evidently $h_1n_1 + h_2n_2 + h_3n_3$ can be made 0 by taking $n_1 = n_2 = n_3 = 0$. Then the
perpendicular distance $d_h$ between the planes with the corresponding values of $p$ is equal to
$|h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3|^{-1}$, and this is the spacing of the crystal planes with Miller indices
$h_1$, $h_2$, $h_3$.
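
As an illustration, the construction can be sketched in Python. This is our own sketch under stated assumptions: the function names are hypothetical and the lattice vectors in the usage note are arbitrary. The integer combination is found by applying the highest-common-factor process first to $h_1$, $h_2$ and then to the result and $h_3$, which is a slightly different grouping from the one in the text but rests on the same process.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def reciprocal(a1, a2, a3):
    """Return the reciprocal vectors (b1, b2, b3) of equation (1) and the cell volume v_a."""
    va = dot(a1, cross(a2, a3))
    b1 = [c / va for c in cross(a2, a3)]
    b2 = [c / va for c in cross(a3, a1)]
    b3 = [c / va for c in cross(a1, a2)]
    return (b1, b2, b3), va

def ext_gcd(a, b):
    """Return (g, s, t) with s*a + t*b = g, where g is the highest common factor."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, s, t = ext_gcd(b, a % b)
    return (g, t, s - (a // b) * t)

def unit_combination(h):
    """Integers (n1, n2, n3) with h1*n1 + h2*n2 + h3*n3 = 1, for indices with no common factor."""
    q, s1, s2 = ext_gcd(h[0], h[1])
    g, u, v = ext_gcd(q, h[2])
    if g != 1:
        raise ValueError("indices have a common factor")
    return (u * s1, u * s2, v)

def plane_spacing(h, b):
    """Spacing d_h = 1 / |h1 b1 + h2 b2 + h3 b3|."""
    H = [sum(h[k] * b[k][i] for k in range(3)) for i in range(3)]
    return 1.0 / math.sqrt(dot(H, H))
```

For a simple cubic lattice of side 2, `plane_spacing((1, 1, 0), reciprocal((2, 0, 0), (0, 2, 0), (0, 0, 2))[0])` gives $2/\surd 2$, the familiar value $a/\surd(h_1^2 + h_2^2 + h_3^2)$.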



Now consider a parallel beam of X-rays falling on a crystal. Suppose that plane waves
travelling in a direction $\mathbf{s}_0$ fall on the atomic lattice and are scattered by the separate
atoms. We want the condition that those travelling away in a direction $\mathbf{s}$ shall reinforce
one another. The difference of path for waves scattered by two atoms separated by $\mathbf{a}_j$ is
$(\mathbf{s} - \mathbf{s}_0)\cdot\mathbf{a}_j$, and the condition for reinforcement is that this shall be a multiple of the
wave-length $\lambda$ for all atoms in the region. Hence

$$(\mathbf{s} - \mathbf{s}_0)\cdot\mathbf{a}_j = k_j\lambda \quad (j = 1, 2, 3), \tag{8}$$

where the $k_j$ are integers. Then from (5)

$$\mathbf{s} - \mathbf{s}_0 = \lambda(k_1\mathbf{b}_1 + k_2\mathbf{b}_2 + k_3\mathbf{b}_3), \tag{9}$$

or, supposing $k_1$, $k_2$, $k_3$ to have a common factor $n$,

$$\mathbf{s} - \mathbf{s}_0 = n\lambda(h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3). \tag{10}$$


Now take an origin at a point $O$ of the reciprocal lattice. The geometrical meaning of
(10) is that if $P$ is a point such that $PO = \mathbf{s}_0/\lambda$, then if $PQ$ is $\mathbf{s}/\lambda$, $Q$ must be a point of the
reciprocal lattice, and all such points that lie on a sphere of radius $1/\lambda$ about $P$ will
correspond to reflexions of rays of wave-length $\lambda$. Moreover $OQ$ is parallel to the external
bisector of the angle $OPQ$, and hence the reflexion can be regarded as taking place at
planes with Miller indices $(h_1, h_2, h_3)$ whose normal is in the direction $OQ$.

If we write $2\theta$ for the angle $OPQ$, so that $\theta$ is the angle of reflexion, we have

$$\frac{n}{d_h} = n\,|h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3| = \frac{1}{\lambda}|\mathbf{s} - \mathbf{s}_0| = \frac{2}{\lambda}\sin\theta \tag{11}$$

or

$$n\lambda = 2d_h\sin\theta, \tag{12}$$

which is Bragg's reflexion condition.

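Further, by squaring (11) we have that

$$n^2\lambda^2\,|h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3|^2 = 4\sin^2\theta, \tag{13}$$

and also from (10), since

$$\mathbf{s}^2 = |n\lambda(h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3) + \mathbf{s}_0|^2 \tag{14}$$

and $\mathbf{s}$, $\mathbf{s}_0$ are unit vectors, that

$$n\lambda = -\frac{2\mathbf{s}_0\cdot(h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3)}{|h_1\mathbf{b}_1 + h_2\mathbf{b}_2 + h_3\mathbf{b}_3|^2}. \tag{15}$$

Further developments are given by P. P. Ewald.*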
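
The reflexion conditions can be verified numerically. The sketch below is only an illustration under assumed values (a simple cubic lattice of side 4, wave-length 1.54 in the same units, indices (1, 1, 0), first order); the names are our own. It builds an incident direction $\mathbf{s}_0$ satisfying (15), forms $\mathbf{s}$ from (10), and checks that $\mathbf{s}$ is a unit vector inclined at $2\theta$ to $\mathbf{s}_0$ with $n\lambda = 2d_h\sin\theta$.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def unit(v):
    length = math.sqrt(dot(v, v))
    return [c / length for c in v]

def any_perpendicular(v):
    """Some unit vector at right angles to v."""
    trial = cross(v, (0.0, 0.0, 1.0))
    if dot(trial, trial) < 1e-12:
        trial = cross(v, (1.0, 0.0, 0.0))
    return unit(trial)

side = 4.0          # assumed cubic cell side
lam = 1.54          # assumed wave-length, same units
n = 1               # order of reflexion
h = (1, 1, 0)       # Miller indices

b = [(1.0 / side, 0.0, 0.0), (0.0, 1.0 / side, 0.0), (0.0, 0.0, 1.0 / side)]
H = [sum(h[k] * b[k][i] for k in range(3)) for i in range(3)]
H_len = math.sqrt(dot(H, H))
d_h = 1.0 / H_len

sin_theta = n * lam * H_len / 2.0           # from (11)
theta = math.asin(sin_theta)

H_hat = unit(H)
e = any_perpendicular(H)
# Incident direction making the glancing angle theta with the planes.
s0 = [-sin_theta * H_hat[i] + math.cos(theta) * e[i] for i in range(3)]
# Scattered direction from (10).
s = [s0[i] + n * lam * H[i] for i in range(3)]
```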


4·14. Curvilinear coordinates. For many problems it is convenient to specify the
position of a point by three functions of the rectangular coordinates that are not constant
over planes. We may, for instance, use spherical polar coordinates; then the coordinate
$r$ is constant over a sphere with centre at the origin. If we continue to denote rectangular
coordinates by $x_i$ and call the curvilinear ones $x'^j$, we shall have

$$dx_i = \frac{\partial x_i}{\partial x'^j}\,dx'^j, \tag{1}$$

$$\frac{\partial\phi}{\partial x_i} = \frac{\partial x'^j}{\partial x_i}\frac{\partial\phi}{\partial x'^j}, \tag{2}$$

so that the summation convention is still applicable. But the partial derivatives are no
longer constant. In (1) $x_i$ must of course be regarded as a known function of the three
$x'^j$, which are permitted to vary independently; in (2), $x'^j$ is conversely regarded as a


* Kristalle und Röntgenstrahlen, 1923, particularly Notes 1 and 2.
function of the three $x_i$. The relations between the coordinates are no longer linear, but
those between small changes are linear, and so are those between derivatives of a scalar.
Sets of quantities that transform like $dx'^j$ are called contravariant and those that
transform like $\partial\phi/\partial x'^j$ are called covariant, as for oblique rectilinear axes. If $ds$ is the distance
between two neighbouring points we have

$$ds^2 = dx_i^2 = g'_{jl}\,dx'^j\,dx'^l, \tag{3}$$

where

$$g'_{jl} = \frac{\partial x_i}{\partial x'^j}\frac{\partial x_i}{\partial x'^l}. \tag{4}$$

Just as for oblique rectilinear coordinates we can form the reciprocal set of quantities
$g'^{jl}$ defined by

$$g'^{jl} = \frac{\partial x'^j}{\partial x_i}\frac{\partial x'^l}{\partial x_i}, \tag{5}$$

since

$$g'_{jl}g'^{ln} = \frac{\partial x_i}{\partial x'^j}\frac{\partial x_i}{\partial x'^l}\frac{\partial x'^l}{\partial x_k}\frac{\partial x'^n}{\partial x_k} = \left(\frac{\partial x_i}{\partial x'^j}\frac{\partial x'^n}{\partial x_k}\right)\left(\frac{\partial x_i}{\partial x'^l}\frac{\partial x'^l}{\partial x_k}\right) = \delta_{ik}\frac{\partial x_i}{\partial x'^j}\frac{\partial x'^n}{\partial x_k} = \frac{\partial x_k}{\partial x'^j}\frac{\partial x'^n}{\partial x_k} = \delta_{jn}. \tag{6}$$


Covariant and contravariant tensors of the second order can be defined according as
they transform under further changes of coordinates like $g'_{jl}$ or $g'^{jl}$. It may be verified,
as an exercise in differential calculus, that if we take a third set of coordinates $x''^{\alpha}$ we get
the same form for $g''_{\alpha\beta}$ by transforming first from $x_i$ to $x'^j$ and then to $x''^{\alpha}$ as if we
transformed to $x''^{\alpha}$ directly.

All information about magnitudes of displacements for small changes of the curvilinear
coordinates is summarized in the $g'_{jl}$. If we vary $x'^1$ without varying $x'^2$ and $x'^3$ the
displacement will be $\{\surd(g'_{11})\,dx'^1, 0, 0\}$. The three component displacements for separate
changes of the new coordinates will not, however, in general be at right angles. The
inclinations are easily found in terms of the $g'_{jl}$ as for oblique coordinates, but are seldom
required. We usually choose our coordinates so that the displacements corresponding
to small changes of the $x'^j$ separately will be at right angles; and the condition for this is

$$\frac{\partial x_i}{\partial x'^j}\frac{\partial x_i}{\partial x'^l} = 0 \quad (j \neq l),$$

or $g'_{jl} = 0$ $(j \neq l)$.

If these conditions are satisfied the new coordinates are said to be orthogonal.

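As an example let us take the $x'^j$ to be spherical polar coordinates $r$, $\theta$, $\lambda$. Then

$$x_1 = r\sin\theta\cos\lambda, \qquad x_2 = r\sin\theta\sin\lambda, \qquad x_3 = r\cos\theta, \tag{7}$$

$$\begin{aligned}
dx_1 &= \sin\theta\cos\lambda\,dr + r\cos\theta\cos\lambda\,d\theta - r\sin\theta\sin\lambda\,d\lambda, \\
dx_2 &= \sin\theta\sin\lambda\,dr + r\cos\theta\sin\lambda\,d\theta + r\sin\theta\cos\lambda\,d\lambda, \\
dx_3 &= \cos\theta\,dr - r\sin\theta\,d\theta,
\end{aligned} \tag{8}$$

and

$$(dx_1)^2 + (dx_2)^2 + (dx_3)^2 = dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\lambda^2.$$

Hence

$$g'_{11} = 1, \qquad g'_{22} = r^2, \qquad g'_{33} = r^2\sin^2\theta.$$

The determinant of the transformation is $(g'_{11}g'_{22}g'_{33})^{1/2} = r^2\sin\theta$. The inverse
transformation is

$$\begin{aligned}
dr &= \sin\theta\cos\lambda\,dx_1 + \sin\theta\sin\lambda\,dx_2 + \cos\theta\,dx_3, \\
r\,d\theta &= \cos\theta\cos\lambda\,dx_1 + \cos\theta\sin\lambda\,dx_2 - \sin\theta\,dx_3, \\
r\sin\theta\,d\lambda &= -\sin\lambda\,dx_1 + \cos\lambda\,dx_2.
\end{aligned}$$

The reciprocal matrix to $g'_{jl}$ is

$$\frac{1}{g'_{11}g'_{22}g'_{33}}\begin{pmatrix} g'_{22}g'_{33} & 0 & 0 \\ 0 & g'_{33}g'_{11} & 0 \\ 0 & 0 & g'_{11}g'_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^{-2} & 0 \\ 0 & 0 & r^{-2}\operatorname{cosec}^2\theta \end{pmatrix}.$$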

We have here a decided difference from any rectilinear coordinates. The non-zero elements
in $g'_{jl}$ and $g'^{jl}$ are now not merely unequal but of different dimensions. In fact in spherical
polar coordinates the contravariant components of a displacement are $dr$, $d\theta$, $d\lambda$. We can
define a set of covariant components by

$$g'_{jl}\,dx'^l = (dr,\; r^2\,d\theta,\; r^2\sin^2\theta\,d\lambda).$$

But neither set are the physical components. The latter would be taken to be component
displacements with regard to rectangular axes at the point, and would be $(dr,\; r\,d\theta,\; r\sin\theta\,d\lambda)$.
The physical components are all lengths. In practice we are usually concerned with the
physical components. We denote $g'_{11}$, $g'_{22}$, $g'_{33}$ by $h_1^2$, $h_2^2$, $h_3^2$, and small changes of the
curvilinear coordinates make displacements $ds_1$, $ds_2$, $ds_3$; then we have for the physical
components

$$ds_1 = h_1\,dx'^1, \qquad ds_2 = h_2\,dx'^2, \qquad ds_3 = h_3\,dx'^3.$$
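
The values of $g'_{jl}$ for any set of curvilinear coordinates can be found numerically from the defining relation (4). The following Python sketch is ours, not the authors'; it differentiates the spherical polar formulae by central differences at an arbitrarily chosen point and recovers the diagonal values and the factors $h_1$, $h_2$, $h_3$. The step length `eps` and the sample point are assumptions of the illustration.

```python
import math

def to_rect(q):
    """Rectangular coordinates from spherical polars (r, theta, lambda)."""
    r, theta, lam = q
    return [r * math.sin(theta) * math.cos(lam),
            r * math.sin(theta) * math.sin(lam),
            r * math.cos(theta)]

def jacobian(f, q, eps=1e-6):
    """J[i][j] approximates dx_i / dx'^j by central differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        q_plus, q_minus = list(q), list(q)
        q_plus[j] += eps
        q_minus[j] -= eps
        f_plus, f_minus = f(q_plus), f(q_minus)
        for i in range(3):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return J

def metric(J):
    """g'_jl = (dx_i/dx'^j)(dx_i/dx'^l), summed over i."""
    return [[sum(J[i][j] * J[i][l] for i in range(3)) for l in range(3)]
            for j in range(3)]

q = (2.0, 0.7, 1.1)                       # sample point (r, theta, lambda)
g = metric(jacobian(to_rect, q))
h = [math.sqrt(g[j][j]) for j in range(3)]

dq = (0.01, 0.02, 0.03)                   # small changes dr, d(theta), d(lambda)
physical = [h[j] * dq[j] for j in range(3)]   # (dr, r d(theta), r sin(theta) d(lambda))
```

The off-diagonal elements of `g` vanish to within the error of the differencing, which is the numerical statement that the coordinates are orthogonal.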


The same relations hold for components of velocity. If we have for rectangular axes a
relation between vectors of the form

$$\dot{x}_i = \partial\phi/\partial x_i,$$

this transforms directly to any set of orthogonal axes, and, written in physical
components, will be

$$\dot{s}_1 = \frac{\partial\phi}{\partial s_1}, \qquad h_1\dot{x}'^1 = \frac{1}{h_1}\frac{\partial\phi}{\partial x'^1}, \quad \text{etc.}$$

Multiplying or dividing by the corresponding $h$ we get the same equations written as
relations between covariant and contravariant components.

The derivative of a vector $A_i$, in curvilinear coordinates, does not in general transform
like a second-order tensor, on account of the variation of $g'_{jl}$ with position. This is the
greatest complication of the tensor treatment in curvilinear coordinates. It can be
overcome by a suitable modification of the derivative, but this would take us beyond the
scope of this book.*

The rules for transforming coordinates will work equally well even if distances between
neighbouring points cannot be put in the form $\delta_{ik}\,dx_i\,dx_k$ by any choice of coordinates. On
a sphere, for instance, we can express the position of any point by two variables, but there
is no way of choosing variables $x_1$, $x_2$ so that in all neighbourhoods on the sphere

$$ds^2 = dx_1^2 + dx_2^2.$$

* Full accounts are given by A. J. McConnell, Applications of the Absolute Differential Calculus,
1931; T. Levi-Civita, The Absolute Differential Calculus, 1927.
The theory of transformations when the quantity corresponding to distance cannot be
put into Euclidean form without introducing new dimensions is the basis of Riemannian
geometry and of the general theory of relativity.

4·15. Electromagnetic theory. The tensor method can be extended to four
dimensions, and then forms a convenient way of stating the equations of electromagnetism. If
we consider the quantity

$$ds^2 = dx_1^2 + dx_2^2 + dx_3^2 - c^2\,dt^2, \tag{1}$$

where $c$ is the velocity of light, $ds$ taken between two neighbouring events (each specified
by three position coordinates and the time) is the same for all observers even if they are in
uniform relative motion. This statement is equivalent to three physical ones: (a) A particle
moving with uniform velocity relative to one frame is moving with uniform velocity
relative to the other. (b) Both observers attach equal values to distances at right angles
to their direction of relative motion. (c) Both observers find the same value for the velocity
of light, in whatever direction. (a) and (b) are taken over from Newtonian physics; (c) is
an additional rule required by the Michelson-Morley experiment. Now if we write
$x_4 = ict$, $ds^2$ reduces to the sum of four squares and can be treated like the square of a
distance in Euclidean geometry, except that we now have to work in four dimensions.
Since it is the same for all frames of reference the transformation from one to another is
an orthogonal transformation in four dimensions.

Let us denote a new frame by accented letters and take the axes of $x_2$ and $x'_2$, also $x_3$
and $x'_3$, at right angles to the direction of the relative motion. Then $x'_2 = x_2$, $x'_3 = x_3$, and

$$x_1^2 + x_4^2 = x_1'^2 + x_4'^2. \tag{2}$$

This is satisfied if

$$x'_1 = x_1\cos\alpha - x_4\sin\alpha, \qquad x'_4 = x_1\sin\alpha + x_4\cos\alpha. \tag{3}$$

The origin of the second frame has zero velocity in that frame; hence if we take $dx'_1/dx'_4 = 0$,
$dx_1/dx_4$ will be $v/ic$, where $v$ is the velocity of the second origin with respect to the first
frame. But this gives

$$\tan\alpha = \frac{v}{ic}, \tag{4}$$

$$x'_1 = \beta x_1 + \frac{i\beta v}{c}x_4, \qquad x'_4 = -\frac{i\beta v}{c}x_1 + \beta x_4, \tag{5}$$

where

$$\beta = (1 - v^2/c^2)^{-1/2}. \tag{6}$$

This transformation, due to Larmor and Lorentz, is a complex orthogonal transformation,
not a unitary one; the product of its matrix with its transpose is 1, but the product with its
Hermitian conjugate is not. For real $x_1$ and $t$ it leaves $x'_1$ and $t'$ real.
The ordinary transformations due to rotation of the axes of $x_1$, $x_2$, $x_3$, leaving $x_4$ unchanged,
can of course be superposed on it. Sets of four quantities, defined for each system of
reference, that are transformed into one another by the Larmor-Lorentz transformation,
can be called components of four-vectors. The fundamental four-vector is $x_\alpha$ $(\alpha = 1, 2, 3, 4)$
itself. But since $ds$ is a scalar, $m\,dx_\alpha/ds$ is a four-vector, where $m$ is any other scalar. If $m$
is the intrinsic mass of a particle, supposed the same in all frames of reference, and $u$ is
the resultant velocity, it follows that $mu_\alpha/\surd(1 - u^2/c^2)$ is a four-vector, where

$$u_4 = dx_4/dt = ic.$$

The first three components correspond to those of linear momentum in Newtonian
dynamics, the Newtonian mass being replaced by $m/\surd(1 - u^2/c^2)$. Then if we denote the
first three components by italic suffixes when treating them separately from the fourth,

$$\left(\frac{mu_a}{\surd(1 - u^2/c^2)},\; \frac{mic}{\surd(1 - u^2/c^2)}\right) \tag{7}$$

is a four-vector.
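
The statement that the transformation (3) is a rotation through an imaginary angle can be tried numerically with complex arithmetic. The following is a minimal Python sketch under assumed values (units with $c = 1$, relative velocity $0.6$, and one sample event); the function name is ours. It applies (3) with $\alpha$ found from (4) and compares the result with the real forms $x'_1 = \beta(x_1 - vt)$, $t' = \beta(t - vx_1/c^2)$ that follow from (5) on putting $x_4 = ict$.

```python
import cmath
import math

def rotate(x1, t, v, c=1.0):
    """Apply (3) with tan(alpha) = v/(ic); return (x1', t') as complex numbers."""
    alpha = cmath.atan(v / (1j * c))      # purely imaginary angle for |v| < c
    x4 = 1j * c * t
    x1_new = x1 * cmath.cos(alpha) - x4 * cmath.sin(alpha)
    x4_new = x1 * cmath.sin(alpha) + x4 * cmath.cos(alpha)
    return x1_new, x4_new / (1j * c)

c = 1.0
v = 0.6
x1, t = 3.0, 5.0                          # sample event

beta = 1.0 / math.sqrt(1.0 - v * v / (c * c))
x1_new, t_new = rotate(x1, t, v, c)

# Sum of squares x_1^2 + x_4^2 before and after, as in (2).
before = x1 ** 2 + (1j * c * t) ** 2
after = x1_new ** 2 + (1j * c * t_new) ** 2
```

With these values $\beta = 1.25$, and the transformed event has $x'_1 = 0$, $t' = 4$ to within rounding, with vanishing imaginary parts.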

Again, since the transformation is orthogonal, the four-dimensional volume element
$dx_1\,dx_2\,dx_3\,dx_4$ is unaltered. Hence the three-dimensional element $d\tau = dx_1\,dx_2\,dx_3$
transforms like $1/dx_4$, that is, like

$$\frac{ds}{dx_4} = \left(1 - \frac{u^2}{c^2}\right)^{1/2}. \tag{8}$$

If we define density as intrinsic mass per unit volume it therefore transforms like
$m(1 - u^2/c^2)^{-1/2}$; and a momentum per unit volume like the product of the density and the
velocity. Comparing with (7) we see that if $\rho$ is the density

$$(\rho u_a,\; ic\rho) \tag{9}$$

is a four-vector. In the same way, assuming that the electric charge of a particle is the
same in all frames of reference we have that, if $\rho$ is the electric charge per unit volume and
$j_a$ the current density, $(j_a,\; ic\rho)$ is a four-vector. It follows that

$$\operatorname{div}\mathbf{j} + ic\frac{\partial\rho}{\partial x_4} = \operatorname{div}\mathbf{j} + \frac{\partial\rho}{\partial t} \tag{10}$$

is a scalar and unaltered by transformation. It is actually zero;

$$\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{j} = 0 \tag{11}$$

by the equation of continuity.

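The pair of Maxwell's equations

$$c\operatorname{curl}\mathbf{H} - \frac{\partial\mathbf{E}}{\partial t} = 4\pi\mathbf{j}, \tag{12}$$

$$\operatorname{div}\mathbf{E} = 4\pi\rho, \tag{13}$$

now show that

$$\left(c\,(\operatorname{curl}\mathbf{H})_a - \frac{\partial E_a}{\partial t},\; ic\operatorname{div}\mathbf{E}\right) \tag{14}$$

is a four-vector. Here $\mathbf{E}$ and $\mathbf{H}$ are the intensities of electric and magnetic field, the former
in e.s.u. The four-dimensional divergence of (14) is obviously 0.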


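The other pair

$$\operatorname{div}\mathbf{H} = 0, \qquad \operatorname{curl}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{H}}{\partial t} = 0, \tag{15}$$

show that $\mathbf{H}$ is the curl of a three-vector $\mathbf{A}$, and that then

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \operatorname{grad}\phi, \tag{16}$$

where $\phi$ is a scalar. $\mathbf{A}$ is arbitrary for given $\mathbf{H}$ to the extent of the gradient of any scalar,
the effect of which on $\mathbf{E}$ could be compensated by a suitable change in $\phi$. Hence we can
impose another condition on $\mathbf{A}$; we suppose that $(\mathbf{A},\; i\phi) = A_\alpha$ is a four-vector. The
condition that $\partial A_\alpha/\partial x_\alpha$ shall be scalar is satisfied if

$$\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0. \tag{17}$$

Then

$$iE_1 = \frac{1}{ic}\frac{\partial A_1}{\partial t} - i\frac{\partial\phi}{\partial x_1} = \frac{\partial A_1}{\partial x_4} - \frac{\partial A_4}{\partial x_1}, \tag{18}$$

$$H_1 = \frac{\partial A_3}{\partial x_2} - \frac{\partial A_2}{\partial x_3}, \tag{19}$$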

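with symmetrical relations; and $iE_a$ and $H_a$ are six components of an antisymmetrical
tensor. We write

$$f_{\alpha\beta} = \frac{\partial A_\beta}{\partial x_\alpha} - \frac{\partial A_\alpha}{\partial x_\beta}, \tag{20}$$

$$H_1 = f_{23}, \quad H_2 = f_{31}, \quad H_3 = f_{12}, \quad iE_1 = f_{41}, \quad iE_2 = f_{42}, \quad iE_3 = f_{43}, \tag{21}$$

and $f_{\alpha\beta}$, the field tensor, is

$$\begin{pmatrix} 0 & H_3 & -H_2 & -iE_1 \\ -H_3 & 0 & H_1 & -iE_2 \\ H_2 & -H_1 & 0 & -iE_3 \\ iE_1 & iE_2 & iE_3 & 0 \end{pmatrix}. \tag{22}$$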

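If we now write $(j_a,\; ic\rho) = s_\alpha$, the pair of equations (12), (13) can be written

$$\frac{\partial f_{\alpha\beta}}{\partial x_\beta} = \frac{4\pi}{c}s_\alpha. \tag{23}$$

The pair (15) can be written

$$\frac{\partial f_{\alpha\beta}}{\partial x_\gamma} + \frac{\partial f_{\beta\gamma}}{\partial x_\alpha} + \frac{\partial f_{\gamma\alpha}}{\partial x_\beta} = 0. \tag{24}$$

If $\alpha$, $\beta$, $\gamma$ are all different these reduce to (15); if two of $\alpha$, $\beta$, $\gamma$ are equal they are identities.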

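The Lorentz force and the generalized stress tensor. If $\mathbf{k}$ is the mechanical force per unit
volume,

$$\mathbf{k} = \rho\mathbf{E} + \frac{1}{c}\,\mathbf{j}\wedge\mathbf{H}, \tag{25}$$

which may be written as the first three components of

$$k_\alpha = \frac{1}{c}f_{\alpha\beta}s_\beta. \tag{26}$$

The fourth component defined by this is

$$k_4 = \frac{1}{c}f_{4a}s_a = \frac{i}{c}E_aj_a, \tag{27}$$

so that $k_4$ is $i/c$ times the work done by the field on the charge per unit volume per unit
time. Using (23) we now have

$$k_\alpha = \frac{1}{4\pi}f_{\alpha\beta}\frac{\partial f_{\beta\gamma}}{\partial x_\gamma}. \tag{28}$$

If we define a tensor $T_{\alpha\gamma}$ by

$$T_{\alpha\gamma} = \frac{1}{4\pi}\left(f_{\alpha\beta}f_{\beta\gamma} + \tfrac14\delta_{\alpha\gamma}f_{\beta\delta}f_{\beta\delta}\right) \tag{29}$$

we find after some algebra, using (24), that

$$k_\alpha = \frac{\partial T_{\alpha\gamma}}{\partial x_\gamma}. \tag{30}$$

The tensor $T$ has the form

$$\begin{pmatrix} T_{11} & T_{12} & T_{13} & -iS_1/c \\ T_{21} & T_{22} & T_{23} & -iS_2/c \\ T_{31} & T_{32} & T_{33} & -iS_3/c \\ -iS_1/c & -iS_2/c & -iS_3/c & u \end{pmatrix}, \tag{31}$$

where the $3 \times 3$ set in the top left corner is the Maxwell stress tensor, $\mathbf{S}$ is the Poynting
vector $\dfrac{c}{4\pi}\,\mathbf{E}\wedge\mathbf{H}$, and $u$ is the energy density $\dfrac{1}{8\pi}(E^2 + H^2)$.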
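
The algebra connecting (22), (26), (29) and (31) can be checked at a single point with arbitrary field values. The sketch below is our own illustration in Python; the numerical values of $\mathbf{E}$, $\mathbf{H}$, $\mathbf{j}$, $\rho$ and $c$ are assumptions chosen only to exercise the formulae, and (29) is taken in the form $T_{\alpha\gamma} = (f_{\alpha\beta}f_{\beta\gamma} + \tfrac14\delta_{\alpha\gamma}f_{\beta\delta}f_{\beta\delta})/4\pi$.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def field_tensor(E, H):
    """The antisymmetrical tensor f of (22)."""
    E1, E2, E3 = E
    H1, H2, H3 = H
    return [[0, H3, -H2, -1j * E1],
            [-H3, 0, H1, -1j * E2],
            [H2, -H1, 0, -1j * E3],
            [1j * E1, 1j * E2, 1j * E3, 0]]

c = 2.0                        # arbitrary value, to expose any misplaced factor of c
E = (1.0, 2.0, 3.0)
H = (-1.0, 0.5, 2.0)
j = (0.3, -0.2, 0.1)
rho = 0.7

f = field_tensor(E, H)
s = [j[0], j[1], j[2], 1j * c * rho]                 # (j_a, ic rho)

# k_alpha = f_{alpha beta} s_beta / c, equation (26).
k = [sum(f[a][b] * s[b] for b in range(4)) / c for a in range(4)]
k_lorentz = [rho * E[a] + cross(j, H)[a] / c for a in range(3)]

# T_{alpha gamma} of (29).
ff = sum(f[b][d] * f[b][d] for b in range(4) for d in range(4))
T = [[(sum(f[a][b] * f[b][g] for b in range(4)) + (0.25 * ff if a == g else 0.0))
      / (4.0 * math.pi) for g in range(4)] for a in range(4)]

S = [c / (4.0 * math.pi) * x for x in cross(E, H)]   # Poynting vector
u = (dot(E, E) + dot(H, H)) / (8.0 * math.pi)        # energy density
```

The first three components of `k` agree with `k_lorentz`, the element `T[3][3]` with `u`, and `T[a][3]` with $-iS_a/c$.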


4·16. Probabilities in chains. We consider a system capable of several different
states. At any instant there is a probability $x_i$ that it is in a state denoted by suffix $i$.
We consider the probability that it will be in state $i$ at a later instant. Given that it
is in state $k$ at the first instant (i.e., if $x_k = 1$) the probability is $a_{ik}$. Then since it must
be in some state at the second instant

$$\sum_i a_{ik} = 1, \tag{1}$$

and the total probability, for any set of values $x_k$, that it will be in state $i$ at the second
instant, is

$$y_i = a_{ik}x_k. \tag{2}$$

Card shuffling is the most familiar instance. The $x_i$ will be the probabilities of the 52!
possible orders of the cards at the start. The conditions of shuffling imply that for any
order before a redistribution several different orders are possible after it; for a known
order $k$ at the first instant these would have probabilities $a_{ik}$, which must add up to 1.
The total probability of order $i$ after redistribution is then given by (2), by the usual rules
for combining probabilities.

The same principle occurs in statistical mechanics. If the data are that the momenta 
and coordinates of a system are within specified finite (not zero) ranges at one instant, 
the subsequent motion is not exactly determinate even on classical mechanics, and 
motions differing considerably will occur according, say, to what pair of molecules 
encounter at the next collision. 

Such probabilities can occur in chains, since the processes can be repeated. If a second
rearrangement is made and the probability of state $i$ after it is $z_i$, we shall have

$$z_i = a_{im}y_m, \tag{3}$$

and so on. The successive probabilities are obtained by multiplication by the same matrix.
This suggests a general treatment. The equations

$$\lambda\theta_i = a_{ik}\theta_k \tag{4}$$

are consistent if

$$\|a_{ik} - \lambda\delta_{ik}\| = 0. \tag{5}$$
Suppose that (5) has $n$ roots $\lambda_j$ and denote the $\theta_i$ in the respective solutions by $\theta_{ij}$.
Summations with regard to $j$ must be made explicit. Suppose further that $x_i$ can be expressed
in the form $\sum_j \alpha_j\theta_{ij}$. Then

$$y_i = \sum_k\sum_j a_{ik}\alpha_j\theta_{kj} = \sum_j \lambda_j\alpha_j\theta_{ij},$$

and the result of $p$ applications of the operation (2) will be $\sum_j \lambda_j^p\alpha_j\theta_{ij}$, as for the class of
dynamical problems described in 4·061. If in (5) we add the elements of each column
we get $n$ sums all equal to $1 - \lambda$, by (1). Hence $\lambda = 1$ is always a root.

It follows that for any set $a_{ik}$ there is a possible set of values of the $x_i$ such that $y_i = x_i$.
They need not all be equal, since we have not assumed

$$\sum_k a_{ik} = 1. \tag{6}$$

But this condition is often simply a matter of our choice of what distributions are taken
together and what are considered as alternatives. In the card-shuffling problem each $a_{ik}$
is 1/52!, both $i$ and $k$ have 52! possible values, and the condition is satisfied. But if we
treated all separately, except that we lumped together those where the ace of spades is
later in the order than the ace of hearts, all $\sum_i a_{ik}$ would still be 1, but all $\sum_k a_{ik}$ would not,
since for a given $i$, $a_{ik}$ would be systematically greater if $i$ corresponds to one of the
combined alternatives than to one of the unaltered ones. Similarly, in the problem of collisions
between molecules of a gas $\sum_k a_{ik}$ will be systematically greater if $i$ refers to a region of
larger volume than for a smaller one. But this can be remedied by taking all the regions
of equal size. Thus in actual problems (6) will often be satisfied, and if not we can usually
choose the alternatives in such a way that it will become satisfied.

We shall therefore assume (6), which makes a considerable simplification in the analysis.
We return to (2). Let the greatest and least of the $x_i$ be $M$ and $m$. We assume that the $x_i$
are not all equal, so that $M > m$. We assume also that no $a_{ik} = 0$. Then, using (6),

$$y_i - M = \sum_k a_{ik}(x_k - M). \tag{7}$$

Terms with $x_k = M$ contribute nothing to this sum and no term is positive. Of the others
there is at least one with $x_k = m$, and therefore if $a$ is the least of the $a_{ik}$, and therefore
$\le \tfrac12$ if there are at least two states,

$$y_i - M \le a(m - M). \tag{8}$$

Similarly

$$y_i - m \ge a(M - m). \tag{9}$$

If then $M'$ and $m'$ are any two of the $y_i$,

$$M' \le M - a(M - m), \tag{10}$$

$$m' \ge m + a(M - m), \tag{11}$$

$$|M' - m'| \le (M - m)(1 - 2a). \tag{12}$$

Thus the extreme range of the variables is multiplied at each step by a factor
not exceeding $1 - 2a$, and must therefore tend to zero with a sufficient number of trials,
whatever the initial probabilities of the possible distributions may be. Thus the probabilities
of all the distributions tend to equality, provided only that none of the $a_{ik}$ is zero. The
theorem can be extended even if some of the $a_{ik}$ are zero. It might be impossible, for
instance, to pass directly from state 1 to state 2, but possible to pass from state 1 to state 3
and back to state 2. The analysis applies equally if we apply it to the result of taking $r$
steps, $a_{ik}$ now being the probability that the system would reach state $i$ in $r$ steps, given
that it was in state $k$ initially. Evidently it will be possible for all $a_{ik}$ to be positive in this
case when some of them are zero for one step. Hence, provided that for some finite number
$r$ of steps, and for any state $k$, there is a non-zero probability for every $i$ that state $i$ will
be reached, the probabilities of all states will tend to equality whatever the initial
probabilities.

It might happen that $|M' - m'|$ in (12) was 0 for all pairs. In that case uniform
probability distribution would be reached in one step.
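
The contraction of the range expressed by (12) can be watched in a small example. The following Python sketch is an illustration only; the three-state matrix is an assumed one, chosen so that both (1) and (6) hold and no element is zero, and the system is supposed to start certainly in the first state.

```python
# Assumed matrix a_ik: every row and every column adds up to 1, and no element is 0.
a = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.2, 0.5]]

def step(a, x):
    """One redistribution, y_i = a_ik x_k (equation (2))."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]

a_min = min(min(row) for row in a)        # the least element, a
x = [1.0, 0.0, 0.0]                        # certain to be in the first state
ranges = [max(x) - min(x)]                 # M - m before any step

for _ in range(60):
    x = step(a, x)
    ranges.append(max(x) - min(x))

bound = 1.0 - 2.0 * a_min                  # the factor 1 - 2a of (12)
```

At every step the new range is found not to exceed `bound` times the old one, and after 60 steps the three probabilities are equal to 1/3 within rounding error.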

An alternative proof, using complex variable theory, is due to Fréchet. For a given $\lambda_j$,
let the $\theta_{ij}$ of largest modulus be such that $|\theta_{ij}| = R$. Then in the Argand diagram* the
$\theta_{kj}$ are a set of points within or on a circle of radius $R$ about the origin, and whatever the
$a_{ik}$ may be, subject to their not being negative and to their sum being 1, $a_{ik}\theta_{kj}$ cannot lie
outside this circle. Therefore $|\lambda_j| R \le R$ and all

$$|\lambda_j| \le 1.$$

Again for each $i$,

$$(\lambda_j - a_{ii})\,\theta_{ij} = \sum_{k \neq i} a_{ik}\theta_{kj}, \tag{13}$$

$$|\lambda_j - a_{ii}|\,|\theta_{ij}| \le R\sum_{k \neq i} a_{ik} = R(1 - a_{ii}). \tag{14}$$

But for one $\theta_{ij}$, $|\theta_{ij}| = R$; hence for this $i$,

$$|\lambda_j - a_{ii}| \le 1 - a_{ii}. \tag{15}$$

If then $a_{ii} > 0$, $\lambda_j$ lies within or on a circle with centre $a_{ii}$ touching the unit circle at $+1$.
Further such a circle centred on the smallest $a_{ii}$ will include all the $\lambda_j$. Hence if all $a_{ik}$ with
$i = k$ are different from 0 it will be impossible for any $\lambda$ other than 1 to have modulus 1. But
this condition says simply that whatever state the system is in it is not certain to move out
of that state; and the conclusion is that when $p$, the number of steps taken, is large enough
$\lambda^p$ tends to zero unless $\lambda = 1$. Hence if none of $a_{11}, a_{22}, \ldots, a_{nn}$ is zero the probabilities of
the states will all tend to definite limits given by the solutions of (4) with $\lambda = 1$. It is
possible to have more than one such solution. In fact we could have all the diagonal
elements 1 and then $y_i = x_i$ for all $i$ and the probabilities never change. A necessary
and sufficient condition that there shall be only one solution with $\lambda = 1$ is that the
matrix $a_{ik} - \delta_{ik}$ shall be of rank $n - 1$.

Up to a point we can consider all possible solutions with $|\lambda| = 1$ together. In (4) let
$\theta_1$ be the $\theta_i$ with the largest modulus $R$. Then if any $\theta_k$ such that $a_{1k} \neq 0$ is such that $|\theta_k| < R$,

$$R = |\theta_1| = \Big|\sum_k a_{1k}\theta_k\Big| < \sum_k a_{1k}R = R,$$

which is a contradiction. The argument can then be extended to $i = 2$ if $|\theta_2| = |\theta_1|$,
and so on. Then if all the $a_{ik}$ are different from 0 the only solution with $|\lambda| = 1$ is

$$\theta_1 = \theta_2 = \ldots = \theta_n, \qquad \lambda = 1,$$


* Cf. 11·04.
and therefore we have another proof that the only possible ultimate steady values of the
probabilities are equal. The argument can again be extended to the case where some of
the $a_{ik}$ are zero; it can only stop if, after all $|\theta_i|$ up to $|\theta_m|$ $(m < n)$ have been treated, it is found that all $a_{ik}$
vanish for $k > m$ and $i \le m$. But this means that it is impossible for any state with $k > m$
to pass to one with $k \le m$. In this case it is easily seen by using the relations (1) and (6)
again that all $a_{ik}$ will also vanish if $k \le m$ and $i > m$, and the converse process is also
impossible. Then there will be an infinite number of possible limiting states, depending
on the total initial probabilities in the various independent sets. Thus the case of a multiple
root at $\lambda = 1$ corresponds to the case where the probabilities fall into two or more
independent sets: and the limiting probability will be the same for each state of the same set.

This shows incidentally that we cannot always make (6) true. For we could certainly
have a set of $a_{ik}$ such that probability could pass from the states with $i \le m$ to those with
$i > m$ but not conversely. It is clear that in this case the whole probability tends to become
concentrated in the latter set.

We have already seen that the existence of a solution with $|\lambda| = 1$ but $\lambda \neq 1$ requires
that some $a_{ik}$, with $i = k$, is zero, and further, in order to make, for $|\theta_k| \le |\theta_1|$,

$$|a_{1k}\theta_k| = |\theta_1|, \tag{16}$$

there must be some $\theta_k$, say $\theta_2$, such that $|\theta_2| = |\theta_1|$, and all $\theta_k$ such that $a_{1k} \neq 0$ must be
equal to $\theta_2$. Since the $a_{ik}$ are real, $a_{11} = 0$ (as we know already) and $\theta_2 = \lambda\theta_1$.

Now we can take $\theta_2$ in place of $\theta_1$ and infer that there is a $\theta_3$ equal to $\lambda\theta_2$, and that for
all $k$ such that $\theta_k \neq \theta_3$, $a_{2k} = 0$. But since only $n$ values of $k$ are available the process must
close in a number of steps $m$, where $m \le n$, and it follows, since all the $\theta_k$ found are different,
that in every row of $a_{ik}$ all elements are 0 except one, which must therefore be 1. Thus the
matrix $a_{ik}$ contains a minor of the form, for $m = 4$,

$$\begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{pmatrix}$$

and the equation $\|a_{ik} - \lambda\delta_{ik}\| = 0$ is satisfied if

$$1 - \lambda^m = 0, \tag{17}$$

the roots of which are $\exp(2r\pi i/m)$, where $r$ is any integer from 0 to $m - 1$. The form of
$a_{ik}$ shows directly that if the system starts in any state of these $m$, it will necessarily
proceed to the others in a definite order and return to the original one in $m$ steps.

If $m = n$, the system must describe the whole set of possible states. For $m < n$ the states
not included in the cycle are independent of those in the cycle and may form other cycles
or have $|\lambda| < 1$ for all roots not equal to 1.
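
The cyclic case is simple enough to try directly. In the Python sketch below, which is ours and uses the matrix displayed above for $m = 4$, a system certainly in the first state is carried through the others and back in four steps; and for each root $\lambda = \exp(2r\pi i/m)$ the set $\theta_k = \lambda^k$ (with $k$ counted from 0, an assumption of the illustration) is verified to satisfy (4).

```python
import cmath

m = 4
# The cyclic minor for m = 4: a[i][k] is the probability of reaching i from k.
a = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [1, 0, 0, 0]]

def step(a, x):
    """One redistribution, y_i = a_ik x_k."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]

start = [1, 0, 0, 0]
history = [start]
for _ in range(m):
    history.append(step(a, history[-1]))

# Largest residual of lambda * theta_i - a_ik theta_k over all roots of (17).
worst = 0.0
for r in range(m):
    lam = cmath.exp(2j * cmath.pi * r / m)
    theta = [lam ** k for k in range(m)]
    image = step(a, theta)
    worst = max(worst, max(abs(lam * theta[i] - image[i]) for i in range(m)))
```

The list `history` shows the single unit of probability passing from state to state in a fixed order, and `worst` is zero to within rounding error.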

The problem therefore resolves itself into several cases. 

(1) If the a ik are such that, no matter what the initial state, there is a non-zero prob¬ 
ability that any other state will be reached in some given finite number of steps, then 
the probabilities of all states tend to become equal when the number of redistributions 
is made large. This is the ergodic theorem. 

(2) If the $a_{ik}$ are such that the states form sets, each member of any one of which is a
possible successor of any other of that set but not of one of any other set, then the characteristic
equation has 1 as a multiple root and the probabilities within each set tend to
become equal; but their limiting values are not independent of the initial values. 

(3) In the above cases the roots of the characteristic equation are all either 1 or have 
modulus less than 1. If there is a root with modulus 1 but not equal to 1, some of the states 
will form a cycle such that each one has a determinate successor, and the original state 
will be reattained after a number of steps not greater than n. 

Case 1 is the one that arises in ordinary card shuffling and in the kinetic theory of gases. 
(The arguments usually given, due to Boltzmann and Gibbs, are fallacious.) Case 2 could 
arise in card shuffling if the pack was divided into two halves and these shuffled separately 
and finally placed together. Obviously many orders attainable by shuffling the complete 
pack become impossible. But the probability that the aces of spades and hearts will be 
together will never become independent of the probability that they were in the same 
half of the pack to start with. Case 3, in card shuffling, would describe a case where the
'shuffle' consisted of always removing one card from the top and putting it at the bottom,
which will never give more than a cut and never a true shuffle. But it also connects any
deterministic mechanics with chain probabilities, since it shows that a necessary and 
sufficient condition that later states shall be exactly specifiable given the state at one 
instant is that the time factors in the solutions shall have modulus 1.* 
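
The three cases can be illustrated numerically. The following is a minimal sketch, not part of the original argument. It assumes that a[i][k] is read as the probability of passing from state i to state k in one redistribution, that every row sums to 1, and, for Case 1, that every column also sums to 1 (as in card shuffling). The three matrices and the names redistribute and iterate are hypothetical choices made for illustration.

```python
import cmath

def redistribute(p, a):
    """One redistribution: new_p[k] = sum over i of p[i] * a[i][k]."""
    n = len(p)
    return [sum(p[i] * a[i][k] for i in range(n)) for k in range(n)]

def iterate(p, a, steps):
    """Apply the redistribution the stated number of times."""
    for _ in range(steps):
        p = redistribute(p, a)
    return p

# Case 1 (assumed example): every state can follow every other state.
ergodic = [[0.50, 0.25, 0.25],
           [0.25, 0.50, 0.25],
           [0.25, 0.25, 0.50]]

# Case 2 (assumed example): two sets {0, 1} and {2, 3} that never mix.
split = [[0.5, 0.5, 0.0, 0.0],
         [0.5, 0.5, 0.0, 0.0],
         [0.0, 0.0, 0.5, 0.5],
         [0.0, 0.0, 0.5, 0.5]]

# Case 3: the cyclic minor displayed in the text for m = 4.
cycle = [[0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1],
         [1, 0, 0, 0]]

p_ergodic = iterate([1.0, 0.0, 0.0], ergodic, 60)      # tends to equal shares
p_split = iterate([0.7, 0.0, 0.3, 0.0], split, 60)     # equal within each set only
p_cycle_1 = iterate([1.0, 0.0, 0.0, 0.0], cycle, 1)    # moves to the next state
p_cycle_4 = iterate([1.0, 0.0, 0.0, 0.0], cycle, 4)    # back at the start

# The roots of 1 - lambda**m = 0 are exp(2 r pi i / m), r = 0, ..., m - 1.
m = 4
roots = [cmath.exp(2j * cmath.pi * r / m) for r in range(m)]
residuals = [abs(1 - lam ** m) for lam in roots]
```

In the second case the totals 0.7 and 0.3 held by the two sets at the start are preserved, which is why the limiting values depend on the initial distribution.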

4·17. Integral equations. These are of several types, the common feature being the
occurrence of an unknown function under the integral sign. They have a considerable
literature, but can only be considered briefly in the present book. Three related types
are as follows:

    $\int_0^a K(x,y)\,\phi(y)\,dy = f(x),$    (1)

    $\phi(x) + \int_0^a K(x,y)\,\phi(y)\,dy = f(x),$    (2)

    $\int_0^a K(x,y)\,\phi(y)\,dy = \lambda\phi(x).$    (3)



Here the limits are fixed; $K(x,y)$ and $f(x)$ are known functions, and $\phi$ is a function to be
found. These equations can be considered as limiting cases of matrix equations. For if
we take points of subdivision at $y/a = 0, 1/n, 2/n, \ldots, (n-1)/n$, we can put

    $f(x_i) = X_i, \quad \phi(y_k) = Y_k, \quad K(x_i, y_k) = K_{ik}.$    (4)

Then the equations are the limits, when $n \to \infty$, of

    $\frac{a}{n} \sum_{k=0}^{n-1} K_{ik} Y_k = X_i,$    (5)

    $Y_i + \frac{a}{n} \sum_{k=0}^{n-1} K_{ik} Y_k = X_i,$    (6)

    $\frac{a}{n} \sum_{k=0}^{n-1} K_{ik} Y_k = \lambda Y_i.$    (7)

* For applications to statistical mechanics and to quantum theory, cf. Proc. Roy. Soc. A, 160,
1937, 337-47; Phil. Mag. (7) 33, 1942, 815-31.

These are useful for numerical solution. With more accurate integration
formulae they can be solved directly for any number of intervals up to 12 on Mallock’s 
machine for solving simultaneous equations. But they also show that considerable 
similarities are to be expected between these types of integral equation and algebraic 
linear equations. In particular, (7) will in general be soluble only for a certain set of 
discrete values of $\lambda$, n in number, and therefore in the limit there will be an infinite number
of solutions. There will be complications in the solutions of (5) and (6) also in cases where 
the determinant formed by the coefficients of the Y k vanishes. 
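
As an illustration of the numerical use of the discrete form, the following sketch solves an equation of the second type by replacing the integral with a sum over n equal intervals. It rests on several assumptions that are not in the text: the weight of each term is taken as the interval width a/n, the kernel K(x,y) = xy and the right-hand side f(x) = 4x/3 on 0 ≤ x ≤ 1 are chosen only because the exact solution φ(x) = x is then known, and the helper names solve_linear and solve_second_kind are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Bring the largest available pivot into place.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def solve_second_kind(kernel, f, a, n):
    """Discretize phi(x) + integral_0^a K(x,y) phi(y) dy = f(x) at y_k = k a / n."""
    h = a / n                                   # assumed weight of each term
    pts = [k * h for k in range(n)]
    matrix = [[(1.0 if i == k else 0.0) + h * kernel(pts[i], pts[k])
               for k in range(n)] for i in range(n)]
    return pts, solve_linear(matrix, [f(x) for x in pts])

# Assumed test problem with exact solution phi(x) = x.
pts, phi = solve_second_kind(lambda x, y: x * y, lambda x: 4.0 * x / 3.0, 1.0, 100)
max_error = max(abs(phi_i - x_i) for x_i, phi_i in zip(pts, phi))
```

The remaining error comes from the crude end-point rule used for the sum; a more accurate integration formula would reduce it without changing the structure of the linear system.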

The function $K(x,y)$ is called the kernel of the equation. The condition $K_{ik} = K_{ki}$
for a symmetrical matrix corresponds to

    $K(y,x) = K(x,y).$    (8)

If this is true the kernel is said to be symmetrical. We can define a hermitian kernel by
the condition

    $K(x,y) = K^*(y,x).$    (9)

Similarly, we can have an antisymmetrical kernel defined by

    $K(x,y) = -K(y,x).$    (10)

Analogues of orthogonal matrices exist. In (1), a solution may be

    $\phi(y) = \int_0^a K(y,x) f(x)\,dx,$    (11)

which is easily seen to correspond to the solution of a set of simultaneous equations by
multiplication by the reciprocal matrix when a = 1.

Integral equations can seldom be solved in finite terms; but there are extensive theories 
of their solution by successive approximation.* 


EXAMPLES 

1. By considering the matrices ^ j, show that two symmetrical matrices do not 

necessarily commute. 

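2. Prove that

    $\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} 1 & -\tan\tfrac12\theta \\ \tan\tfrac12\theta & 1 \end{pmatrix} \begin{pmatrix} 1 & \tan\tfrac12\theta \\ -\tan\tfrac12\theta & 1 \end{pmatrix}^{-1}.$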

3. One of the quadratic forms

    $3x^2 + 2y^2 + 5z^2 + 2yz - 2zx, \qquad x^2 + 2y^2 + 8yz + 12zx + 12xy$

is positive definite. Determine which, and find a real linear transformation that reduces them to the
forms $\xi^2 + \eta^2 + \zeta^2$, $4\xi^2 - 2\eta^2 - \zeta^2$. (Prelim. 1943.)

* E. Schmidt, Math. Ann. 63, 1907, 433-76; F. Smithies, Proc. Lond. Math. Soc. 43, 1937, 255-79;
46, 1940, 409-66; Duke Math. J. 8, 1941, 107-33. H. Bückner, Die praktische Behandlung von
Integralgleichungen, Springer, 1952.

4. Find a real non-singular linear transformation that will reduce the form


to the standard form $y_1^2 + y_2^2 + \ldots + y_p^2 - y_{p+1}^2 - \ldots - y_{p+q}^2$,

where $P = x_1^2 + \ldots + x_n^2$, $Q = \sum_{i<j} x_i x_j$.

Determine for all real $\lambda$ the rank r = p + q and the signature s = p - q of the forms $P + 2\lambda Q$.

(M.T. 1935.)

5. Two uniform bars AB, BC are freely pivoted at B, and supported in a straight line by springs
of stiffness $\lambda$ at A, B and C. The mass of each bar being m, show that the periods $2\pi/\gamma$ of the normal modes
of vibration are given by $m\gamma^2/\lambda = 3,\ 3 \pm \sqrt{3}$. (I.C. 1940.)

6. A long chain of rods each of mass 3m and length 2a, pivoted freely at the joints, lies in a straight
line, and at each joint there is a spring producing a restoring force equal to $-m\lambda_0^2$ times the transverse
displacement. One end of the chain is acted on by a force of period $2\pi/\lambda$, and the other is fixed. Show
that if $\lambda_0/\sqrt{3} < \lambda < \lambda_0$ all rods will be equally agitated, but that for other values of $\lambda$ the motion is
confined to the neighbourhood of the exciting end. (This is a mechanical analogue of a radio frequency
filter.)

7. A light string AB of length l has n - 1 particles $P_1, P_2, \ldots, P_{n-1}$ of mass m/n attached to it, so
that $AP_1 = P_1P_2 = \ldots = P_{n-1}B$. A and B are fixed, and the whole is under tension T.

Show that in a normal mode of period $2\pi/\gamma$ the amplitudes $a_i$ of small transverse motions of the
particles are connected by the relation

    $a_{i+1} - 2\cos\alpha\, a_i + a_{i-1} = 0$

where $\cos\alpha = 1 - \gamma^2 ml/2n^2T$; find the normal frequencies.

By taking the limit as $n \to \infty$, obtain the normal frequencies of a uniform heavy string with the ends
fixed.


8. Illustrate Rayleigh's method by considering the vibration of a uniform rod of length l and flexural
rigidity EI, clamped at both ends which are free from end-thrust. Show that Rayleigh's method gives
the fundamental frequency $\nu$ as $\frac{1}{2\pi}\sqrt{504EI/\rho l^4}$ if the approximate form of the rod at any instant is
taken to be given by y = f(x), where f(x) is the simplest polynomial satisfying the boundary conditions.
(M/c, III, 1936.)

9. If a, b are hermitian, show that ab + ba and i(ab — ba) are hermitian. 

10. Prove that if H is a hermitian matrix, the matrix $U = (H + i1)^{-1}(H - i1)$ exists and is unitary;
and that if U is a unitary matrix, $H = i(U + 1)(U - 1)^{-1}$ exists and is hermitian provided that U has
not a characteristic value 1.

Hence show that any characteristic solution of U is one of H and conversely, and hence that there
is a unitary matrix $l$ such that $l^{\dagger}Ul$ is diagonal. Extend the last result to the case where U has a
characteristic value 1.

11. Show that any 2 × 2 matrix that anticommutes with Pauli's $\sigma_1$ is of the form $a\sigma_2 + b\sigma_3$.
Hence show that there are 8 linearly independent 4 × 4 matrices that anticommute with
Eddington's $E_1$.

12. If a quaternion is defined by the rule

    $u = 1u_0 + iu_1 + ju_2 + ku_3,$

where $i^2 = j^2 = -1$, $ij = k$, $ji = -k$,

and $u_0, u_1, u_2, u_3$ are real numbers, show that for any quaternions u, w there exists a quaternion v such that

    $uv = w,$

provided that $u_0, u_1, u_2, u_3$ are not all zero.

Show that the Eddington matrices E lt , E„, E tl satisfy the conditions for i, j, k.

13. A triangular matrix T is defined as one such that $T_{ik} = 0$ if k > i, or one such that $T_{ik} = 0$
if k < i. Show that a triangular matrix with all its diagonal elements different from zero has an
inverse, and relate this result to the method of solving a set of linear simultaneous equations
$a_{ik}x_k = b_i$ by successive elimination.

Show also that if $TT^{\dagger} = T^{\dagger}T$, then T is diagonal.

14. Show that any transformation of the form B = PAQ, where P, Q are non-singular, makes
the ranks of A, B equal.

15. Show that if H is hermitian $l^{\dagger}Hl$ is hermitian, and that if A is antihermitian $l^{\dagger}Al$ is
antihermitian.

16. A is n × n and has n different latent roots, and B commutes with A. Prove that B can be
expressed as a polynomial in A of degree n - 1.

17. $l^{-1}Al$ is diagonal, and all its diagonal elements have modulus 1; and $l$ is unitary. Prove
that A is unitary.




Chapter 5 

MULTIPLE INTEGRALS 


'One by one, or all at once.'

W. S. GILBERT, The Yeomen of the Guard


5·01. Distance; neighbourhoods; curves; regions. In this chapter we shall be
concerned with functions of two or more variables, say x, y, z. These will often be rectangular
coordinates of a point in the ordinary sense, but not always. We shall generally take them
to be two or three in number, but extensions to more will be obvious. Some simple ideas
from Euclidean geometry can be extended usefully to the cases where the variables are
not rectangular coordinates. One of the chief of these is distance. The distance r between
two points given by rectangular coordinates (x, y, z), (x', y', z') is the non-negative value
that satisfies

    $r^2 = (x'-x)^2 + (y'-y)^2 + (z'-z)^2.$    (1)

If the variables are not all of the same dimensions this will be meaningless, but we can
extend the definition by taking

    $r^2 = \alpha(x'-x)^2 + \beta(y'-y)^2 + \gamma(z'-z)^2,$    (2)

where $\alpha$, $\beta$, $\gamma$ are positive constants chosen to make addition possible; and then by a change
of variables we can reduce this again to the form (1). The distance so defined will not in
general be the Euclidean distance; for instance, x, y, z may be spherical polar coordinates,
and r will then be quite different from the distance expressed in terms of these coordinates.
Often x, y, z will be numbers. The reason for introducing it is that if $x' \to x$, $y' \to y$, $z' \to z$,
then $r \to 0$, and conversely, so that the statement that two sets of values of the variables
are near together can be adequately summarized by the single statement that the distance
is small.

Note that if P, Q, R are three points with coordinates $x_i$, $y_i$, $z_i$ (i not necessarily running
from 1 to 3)

    $y_i - z_i = (y_i - x_i) - (z_i - x_i),$    (3)

    $r_{QR}^2 = r_{PQ}^2 + r_{PR}^2 - 2\sum (y_i - x_i)(z_i - x_i).$    (4)

But by Cauchy's inequality

    $\{\sum (y_i - x_i)(z_i - x_i)\}^2 \le \sum (y_i - x_i)^2 \sum (z_i - x_i)^2 = r_{PQ}^2\, r_{PR}^2.$    (5)

Hence

    $(r_{PQ} - r_{PR})^2 \le r_{QR}^2 \le (r_{PQ} + r_{PR})^2,$    (6)

    $|r_{PQ} - r_{PR}| \le r_{QR} \le r_{PQ} + r_{PR}.$    (7)


This is an extension of a familiar inequality in Euclidean geometry; but it is here to be
understood as a purely analytical theorem. By induction, if $P_r(x_{ri})$ are a set of points, the
distance from $P_1$ to $P_n$ is $\le$ the sum of the distances from $P_1$ to $P_2$, $P_2$ to $P_3$, ..., $P_{n-1}$ to $P_n$.
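
The inequality can be checked numerically for the extended definition of distance. The sketch below is only an illustration under stated assumptions: the positive weights 1, 4 and 0.25 stand for the constants of the extended definition and are chosen arbitrarily, the points are drawn at random, and the small tolerance allows for rounding in floating-point arithmetic. The names distance and random_point are hypothetical.

```python
import math
import random

def distance(p, q, weights):
    """Non-negative r with r**2 = sum of weight * (q_i - p_i)**2."""
    return math.sqrt(sum(w * (b - a) ** 2 for w, a, b in zip(weights, p, q)))

def random_point(rng, m):
    """A point with m coordinates drawn uniformly from [-10, 10]."""
    return [rng.uniform(-10.0, 10.0) for _ in range(m)]

rng = random.Random(0)
weights = [1.0, 4.0, 0.25]       # assumed positive constants
tolerance = 1e-9                 # allowance for rounding only
violations = 0
for _ in range(1000):
    P, Q, R = (random_point(rng, 3) for _ in range(3))
    r_pq = distance(P, Q, weights)
    r_pr = distance(P, R, weights)
    r_qr = distance(Q, R, weights)
    lower_ok = abs(r_pq - r_pr) <= r_qr + tolerance
    upper_ok = r_qr <= r_pq + r_pr + tolerance
    if not (lower_ok and upper_ok):
        violations += 1
```

A random search of this kind cannot prove the inequality; it only confirms that no counter-example turns up, as the analytical argument requires.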

We shall use geometrical language freely; in particular, we shall say that a given set of
values of the variables specifies a point. We need an analogue for more than one variable
of the notion of an interval of one variable. This contemplates (1) that the values of the
variable, if necessary after multiplying by a constant dimensional factor, are real numbers,
and uses the facts: (2) between any two real numbers there is another; (3) any limit-point
of a set of real numbers is also a real number. Further (4) if $x_1$, $x_2$ are points of
an interval, any value between $x_1$ and $x_2$ is also a point of the interval (intervals have no
gaps in them); and (5) an interval, except that of the whole set of finite real numbers, has
at least one end-point, which may be characterized by the fact that within any distance of
it, however small, there are some real numbers that belong to the interval and some that
do not. An end-point may belong to the interval or not. (2) and (4) require only a notion
of an order; we could conceivably say $x_2 > x_1$ without saying how much it is greater. But
(3) and (5) do require the idea of a distance, which we express by $|x_2 - x_1|$.

The generalization of an interval of one variable will be called a region for two or more.
We are not yet ready to define it, but we have seen that distance for one variable plays an
essential part, and that distance is easily defined for any finite number of variables. It
enables us to give an immediate definition of a limit-point in m dimensions in terms of the
notion of a neighbourhood. A neighbourhood of a point P specified by coordinates (x, y, z) in
three dimensions is the set of points Q(x', y', z') such that the distance PQ, defined as the
non-negative value of r satisfying

    $r^2 = (x' - x)^2 + (y' - y)^2 + (z' - z)^2,$

is less than some positive quantity $\delta$. (If we exclude the point P itself we say so explicitly.)
If R is any set of points, P is a limit point of the set R if for any $\delta$ there is a point of R, not
identical with P, whose distance from P is less than $\delta$; or more briefly, if every neighbourhood
of P contains at least one point of R other than P, and therefore an infinite number as in one
dimension. A set is closed if all limit points of subsets of it are members of it.

For more than one variable, we might require that, in general, if all but one are kept 
fixed, the other should be capable of values over an interval and no others. But this leads 
at once to a difficulty. In two dimensions, if y is fixed, it would say that the values of x 
must lie in one interval; thus we should not be able to call the interior of a polygon with 
a re-entrant angle a region. Thus the extension of (4) to more than one dimension requires 
some modification of the obvious procedure of varying only one variable at a time. We 
say instead that if P, Q are within a region they can be connected by a curve lying wholly 
within the region. This rule, however, makes it necessary also to provide a definition of 
a curve. 

The notion of end-points is replaced by that of boundary points; that of interior points 
presents no difficulty. 

Since we shall provide an analytical definition of a curve, we might attempt to provide
one of a region by saying that it is a set of points specified by an inequality, just as an
interval may be specified by $a < x < b$. In fact, regions are often specified in this way, but
a precaution is needed. Suppose that we took the inequality $|x| \times |x - 2| + |y| < \tfrac12$. This
specifies two patches about (0,0) and (2,0), with no point in common. The inequality
$|\sin x| + |y| < \tfrac12$ specifies an infinite set of patches about the points $x = n\pi$, $y = 0$, no
two of which have a point in common. A single inequality therefore does not necessarily
provide a satisfactory specification of a region.
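
The separation into patches can be made visible on a grid. The following sketch is an illustration only and makes several assumptions: the bound in both inequalities is taken to be one half, the plane is sampled at spacing 0.05, grid points are joined only to their four nearest neighbours, and the window for the second inequality (x from -1 to 7.5) is chosen so that exactly the patches about x = 0, π and 2π fall inside it. The name count_patches is hypothetical.

```python
import math

def count_patches(inside, i_min, i_max, j_max, step):
    """Count groups of 4-connected grid points (i*step, j*step) satisfying `inside`."""
    cells = {(i, j)
             for i in range(i_min, i_max + 1)
             for j in range(-j_max, j_max + 1)
             if inside(i * step, j * step)}
    patches = 0
    while cells:
        # Start a new patch and absorb everything reachable from it.
        stack = [cells.pop()]
        patches += 1
        while stack:
            i, j = stack.pop()
            for neighbour in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if neighbour in cells:
                    cells.remove(neighbour)
                    stack.append(neighbour)
    return patches

# |x| |x - 2| + |y| < 1/2 sampled for -1 <= x <= 3, -1 <= y <= 1.
product_patches = count_patches(
    lambda x, y: abs(x) * abs(x - 2) + abs(y) < 0.5, -20, 60, 20, 0.05)

# |sin x| + |y| < 1/2 sampled for -1 <= x <= 7.5, -1 <= y <= 1.
sine_patches = count_patches(
    lambda x, y: abs(math.sin(x)) + abs(y) < 0.5, -20, 150, 20, 0.05)
```

A grid count is not a proof of disconnection, but here it agrees with the direct observation that the first inequality fails everywhere on the line x = 1.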

5·011. Curves. We have to translate into mathematical language what we mean by
saying that a set of points forms a curve. We shall state the argument for three dimensions,
but it applies for any finite number. The first essential is that the points occur in an order
as we proceed along the curve. The second is that there are no gaps in the curve; we can
travel between any two points of the curve without ever leaving the curve. We can express
these conditions by saying that the variables x, y, z, for points on the curve, are continuous
functions of a parameter t in an interval of t. (If the interval of t is finite and closed, x, y, z
will be bounded.) The order of increasing t specifies the order of points on the curve; and
the condition that x, y, z are continuous functions of t and all values of t in the interval are
admissible ensures the absence of gaps. It may appear that the assumption that x, y, z
are continuous functions of a continuous parameter t is not obviously true for all sets that 
we may wish to call curves. But the definition gives more generality than we need, not 
less. Curves have been defined satisfying this condition, with t bounded, that pass through 
every point of the closed region bounded by a unit square.* The possibility of obtaining 
useful results about curves arises from the further restriction, which we shall come to in 
a moment, that they have lengths; and if they have, the length along the curve from a 
fixed point to the point considered will serve as the parameter t. 

A curve will have a multiple point if there are unequal values t = c, t = d, such that
$(x, y, z)_{t=c} = (x, y, z)_{t=d}$. Curves with no multiple points are called simple.

If x, y, z are bounded on the curve and take the same values for both extreme values of
t, the curve is closed. A simple curve that is not closed is called an arc. (The meaning of
closed as applied to curves is quite unrelated to its meaning as applied to intervals, and as 
we shall apply it to regions.) A closed curve in two dimensions is usually called a contour. 

5·012. The length of a curve is defined as follows. Take points $P_r(x_r, y_r, z_r)$ on the curve,
in order of increasing $t_r$, where A is $(x_0, y_0, z_0)$ and B is

    $(X, Y, Z) = (x_n, y_n, z_n),$

define $h_r$ as the positive value satisfying

    $h_r^2 = (x_{r+1} - x_r)^2 + (y_{r+1} - y_r)^2 + (z_{r+1} - z_r)^2,$

and let

    $s_n = \sum_{r=0}^{n-1} h_r.$

As n increases indefinitely, the largest interval of t tending to zero, $s_n$ may tend to a limit.
If it does, and if the limit is independent of the way of choosing the $t_r$ at each stage, the
limit is called the length of the curve between A and B. An equivalent statement is that
the length is the upper bound (if one exists) of $s_n$ for all ways of choosing n and the $t_r$;
and evidently it is unaltered if t is replaced by any parameter t' that is a monotonic
function of t. We denote the length of the curve between A and an arbitrary point P by s.
A given value of s defines a point on the curve, and s, if it exists, can be used as a parameter
in place of t. Curves with lengths are called rectifiable.
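
The definition can be followed numerically by forming the chord sums for finer and finer subdivisions. The sketch below is illustrative only: the curve is assumed to be one turn of the helix x = cos t, y = sin t, z = t, whose length 2π√2 is known independently, the subdivisions are taken equal in t, and each subdivision refines the one before it so that the sums cannot decrease. The names chord_sum and helix are hypothetical.

```python
import math

def chord_sum(curve, t_start, t_end, n):
    """Sum of the n chord lengths h_r for equally spaced parameter values t_r."""
    points = [curve(t_start + (t_end - t_start) * r / n) for r in range(n + 1)]
    return sum(math.dist(points[r], points[r + 1]) for r in range(n))

def helix(t):
    """Assumed example curve: one turn of a circular helix."""
    return (math.cos(t), math.sin(t), t)

exact_length = 2.0 * math.pi * math.sqrt(2.0)

# Each subdivision contains all the points of the previous one.
sums = [chord_sum(helix, 0.0, 2.0 * math.pi, n) for n in (4, 16, 64, 256)]
```

The sums increase with n and stay below the known length, in agreement with the statement that the length is the upper bound of the chord sums.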


5·013. A necessary and sufficient condition for a curve to have a finite length is that x, y, z
all have bounded variation on the curve. First assume that the curve has a finite length.
Then

    $h_r \ge |x_{r+1} - x_r|, \qquad s \ge s_n \ge \sum |x_{r+1} - x_r|.$

Hence for any choice of the points $t_r$, $\sum |x_{r+1} - x_r|$ is not greater than the length of the
curve; hence x has bounded variation on the curve. Similarly, y and z have bounded
variation on the curve. Conversely, let $V_x$, $V_y$, $V_z$, the total variations of x, y, z on the
curve, all be less than M. Then for any subdivision

    $\sum h_r \le \sum \{ |x_{r+1} - x_r| + |y_{r+1} - y_r| + |z_{r+1} - z_r| \} < 3M,$

and therefore $\sum h_r$ is bounded above. But adding new points of subdivision cannot reduce
$\sum h_r$; hence for any process of subdivision $\sum h_r$ tends to a limit. Uniqueness of the limit is
proved as for the Riemann integral.

* First by Peano: cf. Hobson, Functions of a Real Variable, Camb. Univ. Press, 1907, 330.

5·014. Hence we are led to the following definitions. An interior point of a set of points
in m dimensions is a point P of the set such that for some $\delta$, every point within distance $\delta$
of P is a member of the set. An exterior point Q is one not of the set such that for some $\delta$,
no point within distance $\delta$ of Q is a member of the set. A boundary point R is a point (either
of the set or not) such that however we choose $\delta$, within distance $\delta$ of R there is at least one
point that is a member of the set and at least one that is not. A region is a set such that
every pair of points of the set can be joined by a curve all of whose points, except possibly
the end points, are interior points of the set. A region is closed if all its boundary points
are members of it, open if all its points are interior points. It is bounded if for all pairs P, Q
of points of it, the distances PQ have an upper bound. This upper bound is called the
diameter. Clearly in a bounded region the values of all coordinates are bounded. As for
intervals, if we refer to a region as closed we imply that it is bounded.

We take the following examples, mostly in two dimensions. 

(1) The circle $x^2 + y^2 = 1$ is not a region. For if we take any circle about a point of it, it
contains points that are on the circle and points that are not. Hence all points of the set
are boundary points; and if we pass from one point to another along the circle, we never
pass through an interior point at all.

(2) The points where $x^2 + y^2 < 1$ form an open region. The boundary is the circle

    $x^2 + y^2 = 1.$

About any point within the circle we can draw a circle with sufficiently small radius for
all points within it to satisfy $x^2 + y^2 < 1$; hence all points are interior points and the region
is open. The diameter is 2.

(3) The points where $x^2 + y^2 \le 1$ form a bounded closed region.

(4) The points where $0 \le x \le a$, $0 \le y \le b$ form a bounded closed region of diameter

    $(a^2 + b^2)^{1/2}.$

(5) The points where $0 \le x \le a$, $0 < y < b$ form a bounded region, which is neither open
nor closed, since part of the boundary belongs to the region and part does not.

(6) The set of all points in a plane is an unbounded open region.

(7) The set of all points in the plane, excluding points on the line from (-1,0) to (1,0),
is an unbounded open region. The line is the boundary, in this case an internal one.

(8) The set of points such that $0 < x^2 + y^2 < 1$ is a bounded open region; the origin and the
circle together constitute the boundary.

(9) The set of all points interior to two circles of radius $\tfrac12$ and centres (-1,0) and (1,0)
is not a single region, because if we take a pair of points, one within each circle, there is
no curve that connects them without passing outside both circles.

(10) Even the set of all points within and on two circles of radius 1 and centres (-1,0)
and (1,0) is not a single region. For though we can pass from any point of it to any other
without leaving the set, we cannot pass from (-1,0) to (1,0) in the set without passing
through (0,0), which is not an interior point but a boundary point.

(11) The set of points defined by $1 \le x^2 + y^2 \le 4$ is a closed region, but has the peculiarity
that we can pass from any point of it to any other by paths of two distinct sets (clockwise
and counterclockwise) while keeping within the region, and a path of one set cannot be
deformed into one of the other without passing outside the region. Such regions are called
multiply connected, and play a prominent part in the theory of functions of a complex variable.

(12) On the other hand, in three dimensions, if we take the region given by

    $1 < x^2 + y^2 + z^2 < 4,$

though it has the same property that we may describe as having a hole in it, any path
connecting two points can be continuously deformed into any other. Thus there is a quite
fundamental difference between regions in two and more dimensions.

In two dimensions any simple closed curve divides the plane into two regions, an interior 
and an exterior one; this looks obvious, but is in fact difficult to prove,* and we shall 
assume it. Conversely, the boundary of a region in two dimensions may be a simple closed 
curve, but may consist in part or wholly of arcs or isolated points, as in examples (7), (8). 
In m dimensions, where m> 2, a closed curve does not divide the space into parts; the 
boundary of a region is usually a region of m — 1 dimensions, but may be of any smaller 
number. If m = 3 the boundary is usually of two dimensions and is specified by taking 
the coordinates as functions of two parameters; such sets are called surfaces. 

By definition every point of a region and every boundary point are limit-points of 
points of the region, whether the region is open or closed or neither. Conversely, any 
limit-point of a set of points of a region is a point of the region or a boundary point (and 
therefore a point of the region if the region is closed). For if not, it would be an exterior 
point and therefore no point of the region would be within a certain non-zero distance 
from it. 

5·02. The Heine-Borel theorem in m dimensions. Let D be a bounded closed set of
points; let F be a family of regions I such that every point P of D is an interior point of one
of the family, say $I_P$; then a finite subset of F exists such that the same relations hold for
the I of the subset. The proof is almost the same as in one dimension. We state it as
for three dimensions. Since D is bounded it lies within a cube E of side L, with sides
parallel to the axes. As before, if all points of D are interior to a single I there is nothing to
prove. If not, divide E by planes parallel to all the axes into 8 cubes of side $\tfrac12 L$. Any cube
that contains no points of D needs no further consideration. If for each cube that contains
points of D these points are all interior to a single I the result holds; if not, again divide
any cube that is not included in a single I into eight equal cubes. Then if the result is not
established in a finite number of steps we have a nest of cubes, each part of the preceding
one, with half the side, all containing at least one point of D, and none included in an I.
This nest determines a limit point $P_0$, which must be a point of D since D is closed. But $P_0$
is interior to an I, say $I_0$; and therefore there is a $\delta$ such that all points within distance $\delta$
of $P_0$ lie in $I_0$. But for some n the diameter of the cube of side $2^{-n}L$ is less than $\delta$ and therefore
this cube is contained in $I_0$. Thus, contrary to supposition, there is an n such that the
cube of the nest of side $2^{-n}L$ is wholly interior to an I. Hence the result follows.

Note that the argument does not assume that D is a region. It might be the set of points 
(1/n, 0,0), where n is a positive integer, together with (0,0,0), which is the only limit point 
of the set. 

Note also that, though we have proceeded by dividing only cubes not yet covered by 
an I, the theorem follows equally if we divide all cubes at each stage; for if a single I 
includes a cube it also includes any of its parts. Thus there is no objection to taking all the 
cubes equal. 


* M. H. A. Newman, Topology of Plane Sets, 1939, 104.

As in one dimension we may relax the conditions slightly. If P is a boundary point of D
it is enough that it should also be a boundary point of a closed $I_P$. For about each boundary
point we may take a sphere and define $J_P$ as the region consisting of all points belonging
either to $I_P$ or to this sphere (or both); and for other points of D take $J_P$ identical with $I_P$.
Then the conditions hold for $J_P$; and each boundary point is a point of $I_P$, though not
necessarily an interior point.

An immediate consequence, as for one variable, is that an infinite set of points in a bounded
closed region has a limit-point belonging to the region.

5·021. The modified Heine-Borel theorem. Let D be a bounded closed region. Let
every point P of D be interior to a region $I_P$ of a family F. Then D can be divided into a finite
set of regions $G_P$ such that every $G_P$ is part of the $I_P$ associated with a point P common to $G_P$
and D. The same relaxation for boundary points of D can be made as for the main theorem.

The proof is the same as for the Heine-Borel theorem. But now we cannot necessarily
make all the cubes equal; for if a cube is covered by an $I_P$ corresponding to a point P
interior to the cube, and is then divided, only one part can contain this P as an interior
point; and then another part may not be covered by an $I_Q$ corresponding to any Q of itself.
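
The subdivision used in the proof can be carried out explicitly in a simple case. The sketch below is an illustration under assumptions that are not in the text: it works in two dimensions, D is taken to be the closed unit square, each $I_P$ is taken to be the open disc about P with a radius given by the hypothetical function radius, and for each square only the disc belonging to its own centre is tried. Because the assumed radius never falls below 0.05, the subdivision must stop after a finite number of halvings.

```python
import math

def radius(p):
    """Assumed radius of the open disc I_P about P; any positive rule would do."""
    x, y = p
    return 0.05 + 0.4 * math.hypot(x, y)

def subdivide(x0, y0, side, pieces):
    """Halve the closed square with corner (x0, y0) until each piece lies
    inside the disc I_P belonging to its own centre P."""
    centre = (x0 + side / 2, y0 + side / 2)
    half_diagonal = side * math.sqrt(2.0) / 2
    if half_diagonal < radius(centre):
        # Every point of the closed square is within the open disc about its centre.
        pieces.append((x0, y0, side, centre))
        return
    half = side / 2
    for dx in (0.0, half):
        for dy in (0.0, half):
            subdivide(x0 + dx, y0 + dy, half, pieces)

pieces = []
subdivide(0.0, 0.0, 1.0, pieces)
total_area = sum(side * side for _, _, side, _ in pieces)
distinct_sides = sorted({side for _, _, side, _ in pieces})
```

In this example the pieces come out with several different sides: squares near the origin, where the assumed discs are small, must be divided further than those far from it.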

5·03. Functions of several variables. As for one variable, a function is defined for all
points of a set. In particular we may consider its behaviour over a region. 

5·031. Continuity in more than one dimension. We say that a function $f(P) = f(x, y)$,
$f(x, y, z)$, ... is continuous at a point P if for any $\epsilon$ there is a neighbourhood of P such that for
all points Q of this neighbourhood $|f(Q) - f(P)| < \epsilon$. Clearly if f is continuous in this sense,
it is a continuous function of any coordinate when the others are kept constant. The
converse is not true. Take

    $f(x, y) = \frac{xy}{x^2 + y^2}$

except when x = y = 0, when f(0,0) = 0. This is a continuous function of x for any fixed
y, and of y for any fixed x. But if $x = r\cos\theta$, $y = r\sin\theta$,

    $f(x, y) = \cos\theta \sin\theta$

which takes all values from $-\tfrac12$ to $\tfrac12$ irrespective of r. Hence in any neighbourhood of (0,0)
there are points where f(x, y) differs from f(0,0) by more than $\tfrac14$.
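
The behaviour of this function near the origin is easy to tabulate. The sketch below assumes the function is xy/(x² + y²) with the value 0 assigned at the origin, as the polar form cos θ sin θ indicates; the sample points, taken at distances 10⁻¹ down to 10⁻⁷, are arbitrary.

```python
def f(x, y):
    """xy / (x**2 + y**2), with the value 0 assigned at the origin."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

scales = [10.0 ** -k for k in range(1, 8)]

# Approach the origin along the x axis, the y axis, and the line y = x.
along_x_axis = [f(t, 0.0) for t in scales]
along_y_axis = [f(0.0, t) for t in scales]
along_diagonal = [f(t, t) for t in scales]
```

Along either axis the values are all zero, while along the line y = x they remain at one half however close the point is to the origin, so no neighbourhood of the origin keeps f within a quarter of f(0,0).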

A function is continuous in a region if for every P of the region and any $\epsilon$ there is a $\delta(P)$
(depending possibly on the position of P) such that for any Q of the region satisfying

    $r_{PQ} < \delta(P), \qquad |f(Q) - f(P)| < \epsilon.$

This does not assume that f is continuous at boundary points; the inequality stated might
not hold if P is a boundary point and Q a point exterior to the region, but we are not
interested in exterior points and therefore insert the words 'of the region' in the second line.

Since a neighbourhood is a region, we can infer at once from the Heine-Borel theorem
that if f(P) is continuous in a bounded closed region, for any $\epsilon$ there is a $\delta$ such that if P and Q
are any points of the region satisfying

    $r_{PQ} < \delta$, then $|f(Q) - f(P)| < \epsilon.$

The following theorems may be proved by similar methods to those used for one
variable:

A function continuous in a closed region is bounded in the region, and attains its upper and
lower bounds.

If f(P) is continuous in a region and there are points A, B in the region where

    $f(A) < 0, \qquad f(B) > 0,$

then for any curve in the region connecting A, B there is a point P on the curve where f(P) = 0.

5·032. We define a distance between a point P and a set of points S as the lower bound
of the distances between P and points of the set. We define a distance between two sets
similarly.

The distance of a point P from a set S is a continuous function of the position of P. Denote
the distance of P from the set by $\delta(P)$. Then for any positive $\omega$ there is a point Q of S such that

    $\delta(P) \le PQ \le \delta(P) + \omega.$

Take another point P' and put PP' = r. Then

    $PQ - r \le P'Q \le PQ + r.$

Similarly there is a Q' such that

    $\delta(P') \le P'Q' \le \delta(P') + \omega$

and $P'Q' \le P'Q.$

Hence $\delta(P') \le P'Q' \le P'Q \le PQ + r \le \delta(P) + r + \omega.$

Similarly $\delta(P) \le \delta(P') + r + \omega;$

hence $|\delta(P') - \delta(P)| \le 2r + 2\omega.$

Take $\omega = \tfrac14\epsilon$; then for all $r < \tfrac14\epsilon$, $|\delta(P') - \delta(P)| < \epsilon$,

and the result follows.
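
For a finite set the lower bound is attained, so the argument can be tried with ω = 0. The sketch below is an illustration under assumptions of its own: S is taken to be 50 random points of a square, the distance is Euclidean in two dimensions, and the quantity checked is the bound |δ(P') - δ(P)| ≤ r, which is stronger than the 2r + 2ω obtained in the text and therefore implies it. The name set_distance is hypothetical.

```python
import math
import random

def set_distance(p, points):
    """Lower bound of the distances from p to the points of a finite set."""
    return min(math.dist(p, q) for q in points)

rng = random.Random(1)
S = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(50)]

worst_excess = float("-inf")
for _ in range(1000):
    P = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
    P_dash = (rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0))
    r = math.dist(P, P_dash)
    change = abs(set_distance(P_dash, S) - set_distance(P, S))
    # The excess of the change over r should never be positive,
    # apart from rounding.
    worst_excess = max(worst_excess, change - r)
```

Since the change in the distance never exceeds the displacement r, a small displacement of P produces a correspondingly small change in its distance from the set.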

It follows that if C is a curve all points of which are interior to a region, the distance of C
from the boundary of the region is positive. For the distances of points of C from the boundary
are a set of positive quantities, and their lower bound is positive or zero. But since the
distance is a continuous function of position on C it takes its lower bound; the distance
never being zero the lower bound is not zero.

5·033. Any two interior points A, B of a region can be connected by a finite number of
displacements, in each of which only one coordinate varies. We state the proof as for two
dimensions. By definition of a region A, B can be connected by a curve C wholly interior
to the region, and this curve has a positive distance $\delta$ from the boundary. Also, since (x, y)
on the curve are continuous functions of a parameter t, the distance of a point on the curve
from a fixed point is a continuous function of t, and is therefore bounded. Take a square
with sides parallel to the axes and large enough to include the whole of C, and subdivide
it into squares of diameter less than $\delta$. Of those intersected by the curve none meets the
boundary of the region. The curve may intersect the boundary of a square many times,
even an infinite number. But if t = 0 at A, the values of t such that the point on the curve
specified by t is in the same square as A have an upper bound, which we may say defines
the point $P_1$ where the curve last leaves the square containing A and enters another. Then
for the square entered at $P_1$ take $P_2$, the last point where the curve enters another, and so
on. The number of squares is finite; hence the number crossed by the curve is finite, so that
it is possible to proceed from A to B by a finite number of steps each within one square.
For each square we can now proceed from $P_r$ to $P_{r+1}$ by changing the coordinates in
succession.

We see incidentally that any two interior points of a region can be connected by a rectifiable 
curve. 


A consequence is: If the partial derivatives of a function are zero everywhere in a region,
the function is constant at all interior points. If $x$ alone is varied in a displacement, and
$\partial f/\partial x = 0$ from $x = x_r$ to $x_{r+1}$, with $y$, $z$ constant, it follows from 1·13 that $f(x_r, y) = f(x_{r+1}, y)$,
and similarly for variations of $y$ or $z$ alone. Hence we can show in a finite number of steps
that $f$ has the same value, say $a$, at any two interior points. If in addition $f(x, y)$ is
continuous in a closed region and $P$ is a boundary point, for any $\epsilon$ there is an interior point
$Q$ such that $|f(Q) - f(P)| < \epsilon$, and therefore $|f(P) - a| < \epsilon$. Thus $f(x, y)$ is also equal to $a$
at boundary points if it is continuous in the closed region.

This result is the basis of the extension to several dimensions of the method of finding 
an integral as the inverse of a derivative. It would not be true if we had defined a region 
in a way that permitted it to consist of several detached pieces. 


5·04. Differentiability in more than one dimension. The derivative of a function
$f(x)$ of one variable is

$$f'(x) = \lim_{h \to 0} \frac{1}{h}\{f(x+h) - f(x)\}. \tag{1}$$

This may be written $$f(x+h) = f(x) + hf'(x) + o(h). \tag{2}$$

In several dimensions we want an approximation to the values of $f(P)$ in a neighbourhood
of $P$, with an error again that becomes arbitrarily small compared with the distance when
the distance is small enough. The kind of approximation needed is therefore, for two
variables,

$$f(x+h, y+k) = f(x,y) + hf_x + kf_y + \lambda(h^2+k^2)^{1/2}, \tag{3}$$

where $f_x$, $f_y$ are independent of $h$, $k$, and $\lambda$ for given $x$, $y$ is a function of $h$, $k$, tending to 0
when $h^2 + k^2 \to 0$. Taking $k = 0$, $h = 0$ in turn we have

$$\frac{\partial f}{\partial x} = f_x, \quad \frac{\partial f}{\partial y} = f_y, \tag{4}$$

by definition of partial derivatives. Then we may write the condition as: $f(x,y)$ is
differentiable at $(x, y)$ if, for any $\epsilon$, there is a $\delta$ such that for $h^2 + k^2 < \delta^2$

$$|f(x+h, y+k) - f(x,y) - hf_x - kf_y| < \epsilon(h^2+k^2)^{1/2}, \tag{5}$$

where $f_x$, $f_y$ are independent of $h$ and $k$. We shall see in a moment that this condition is more
severe than the mere existence of the partial derivatives, less severe than their continuity.

If for every $P(x, y)$ of a region, a relation of the form (5) is satisfied for given $\epsilon$ by the values
of $f$ at other points $Q(x+h, y+k)$ of the region satisfying $(h^2+k^2)^{1/2} < \delta$, where $\delta$ may now
depend on $P$, $f(x, y)$ is said to be differentiable in the region.

The modifications for larger numbers of variables are obvious. 

This definition of differentiability of a function of several variables is due to Stolz and 
W. H. Young.* It is the basis of the theory of functions of a complex variable (cf. 
Chapter 11) and also has important physical applications, of which we have made use in 
Chapter 3, and which follow immediately from the definition. If a function is continuous in 
a region , a sufficient condition for its gradient at a point to be a vector is that the function shall 
be differentiable there. A sufficient condition for the gradient of a function to be a vector function 
in a region is that the function shall be differentiable at every point of the region. Similarly, 
if $u_i$ is a vector function in a region, a sufficient condition that $\partial u_i/\partial x_k$ shall be a tensor of the
second order is that the components $u_i$ shall be differentiable.

* Proc. Lond. Math. Soc. (2) 7, 1909, 157-80.

Note that the partial derivatives of a function may exist everywhere without the function
being differentiable. This is true, for instance, of the function

$$f(x,y) = \frac{xy}{x^2+y^2}, \quad f(0,0) = 0.$$

Here at $(0,0)$ we have $\partial f/\partial x = \partial f/\partial y = 0$, but for small $x$, $y$

$$\frac{1}{r}\left\{f(x,y) - f(0,0) - x\frac{\partial f}{\partial x}(0,0) - y\frac{\partial f}{\partial y}(0,0)\right\} = \frac{xy}{r^3},$$

which is unbounded in any neighbourhood of $(0,0)$.

Even for the function $$f(x,y) = \frac{x^2y}{x^2+y^2}, \quad f(0,0) = 0,$$

we find

$$\frac{1}{r}\left\{f(x,y) - f(0,0) - x\frac{\partial f}{\partial x}(0,0) - y\frac{\partial f}{\partial y}(0,0)\right\} = \frac{x^2y}{r(x^2+y^2)} = \frac{x^2y}{r^3},$$

which takes all values between $-\dfrac{2}{3\sqrt3}$ and $\dfrac{2}{3\sqrt3}$ within any neighbourhood of the origin.
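
The failure of differentiability for $x^2y/(x^2+y^2)$ can be seen numerically: along a ray at angle $\theta$ the quotient $\{f(x,y) - f(0,0)\}/r$ equals $\cos^2\theta\sin\theta$ whatever the value of $r$, so it cannot tend to 0 with $r$. The sketch below is illustrative only; the function names `f` and `quotient` are assumptions of the example.

```python
import math

def f(x, y):
    """f = x^2 y / (x^2 + y^2), with f(0, 0) = 0."""
    return 0.0 if x == 0.0 and y == 0.0 else x * x * y / (x * x + y * y)

def quotient(r, theta):
    """{f(x, y) - f(0, 0)} / r at polar position (r, theta).

    Both partial derivatives vanish at the origin, so no further terms appear.
    """
    x, y = r * math.cos(theta), r * math.sin(theta)
    return f(x, y) / r

# Largest value of cos^2(theta) sin(theta), reached where sin(theta) = 1/sqrt(3).
PEAK = 2 / (3 * math.sqrt(3))
```

Evaluating `quotient(r, math.asin(1 / math.sqrt(3)))` for decreasing `r` gives the same value `PEAK` each time.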

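5·041. If the partial derivatives of a function are continuous in a neighbourhood of a point,
the function is differentiable at the point. For

$$f(x+h, y+k) - f(x,y) = \{f(x+h, y+k) - f(x+h, y)\} + \{f(x+h, y) - f(x,y)\}$$

$$= k\left(\frac{\partial f}{\partial y}\right)_{x+h,\,y+\theta k} + h\left(\frac{\partial f}{\partial x}\right)_{x+\phi h,\,y} \tag{1}$$

for some $\theta$, $\phi$ between 0 and 1, since the derivatives exist throughout the interval
considered; and since they are continuous, for any positive $\omega$ there is a $\delta$ such that if

$$h^2 + k^2 < \delta^2, \tag{2}$$

$$\left|\left(\frac{\partial f}{\partial x}\right)_{x+h,\,y+k} - \left(\frac{\partial f}{\partial x}\right)_{x,\,y}\right| < \omega, \quad \left|\left(\frac{\partial f}{\partial y}\right)_{x+h,\,y+k} - \left(\frac{\partial f}{\partial y}\right)_{x,\,y}\right| < \omega, \tag{3}$$

and therefore, since $h^2 + \theta^2k^2 < \delta^2$, $\phi^2h^2 < \delta^2$,

$$\left|f(x+h, y+k) - f(x,y) - h\frac{\partial f}{\partial x} - k\frac{\partial f}{\partial y}\right| < \omega|h| + \omega|k| \le \omega\sqrt2\,\sqrt{h^2+k^2}. \tag{4}$$

This is the simplest sufficient condition for differentiability. To show that it is not a
necessary condition, take

$$f(x,y) = x^2\sin\frac{1}{x} \quad (x \ne 0); \qquad f(0,y) = 0.$$

$\partial f/\partial x$ and $\partial f/\partial y$ exist everywhere and are zero at $(0,0)$; and evidently

$$\frac{1}{r}\{f(x,y) - f(0,0)\} = \frac{x^2}{r}\sin\frac{1}{x},$$

which tends to 0 with $r$. Hence the function is differentiable at $(0,0)$. But for $x \ne 0$

$$\frac{\partial f}{\partial x} = -\cos\frac{1}{x} + 2x\sin\frac{1}{x},$$

which is not continuous in a neighbourhood of $(0,0)$.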
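
A short numerical illustration of this example follows, as a sketch with hypothetical function names. It evaluates $\partial f/\partial x$ at the points $x = 1/(2n\pi)$ and $x = 1/((2n+1)\pi)$, where it is close to $-1$ and $+1$ respectively however large $n$ is, while the quotient $\{f - f(0,0)\}/r$ on the axis $y = 0$ is bounded by $|x|$.

```python
import math

def dfdx(x):
    """Partial derivative of x^2 sin(1/x) with respect to x (taken as 0 at x = 0)."""
    return 0.0 if x == 0.0 else 2 * x * math.sin(1 / x) - math.cos(1 / x)

def quotient(x):
    """{f(x, 0) - f(0, 0)} / r on the axis y = 0, where r = |x| and x != 0."""
    return x * x * math.sin(1 / x) / abs(x)
```

The derivative therefore oscillates between values near $-1$ and $+1$ arbitrarily close to the origin, although the function is differentiable there.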

5·042. Covering theorem for differentiable functions. For variety we take the case of three
variables. If $f(x,y,z)$ is differentiable throughout a bounded closed region $D$, for any $\epsilon$ it is
possible to superpose a finite set of cubes $C_r$ such that for each $C_r$ there is a point $P_r$ common to
$C_r$ and $D$ such that for any point belonging to $C_r$ and to $D$

$$|f(x,y,z) - f(P_r) - (x-x_r)f_x(P_r) - (y-y_r)f_y(P_r) - (z-z_r)f_z(P_r)| < \epsilon\{(x-x_r)^2 + (y-y_r)^2 + (z-z_r)^2\}^{1/2}.$$

An inequality of the form 5·04 (5) extended to three variables holds for a neighbourhood of
every point of $D$. Hence, by the modified Heine-Borel theorem, we can superpose a finite
set of cubes with the required property. The cubes will in general not be equal.


5·05. Double integrals. We define first the double integral of a function $f(x, y)$ over
a rectangle $0 \le x \le A$, $0 \le y \le B$. We suppose the rectangle divided into rectangles of sides
$h_r$, $k_s$ all of whose diagonals are $< \delta$; in each of these we take a specimen value of $f(x, y)$,
say $f(\xi_r, \eta_s)$, and form the sum $\sum\sum f(\xi_r, \eta_s) h_r k_s$. If this sum tends to a unique limit when
$\delta \to 0$, this limit is the double integral

$$\int_0^A\!\!\int_0^B f(x,y)\,dx\,dy. \tag{1}$$

A necessary and sufficient condition for the existence of the double integral is that $f(x, y)$
shall be bounded and that for any $\omega$, $\alpha$ the points where the leap of $f(x, y)$ is $\ge \omega$ can be enclosed
in a finite set of rectangles whose total area is $< \alpha$. The proof is a straightforward extension
of that of the corresponding theorem for the Riemann integral. It is easy to show that
a rectifiable curve can be enclosed in a finite set of rectangles of arbitrarily small area.
Consequently discontinuities along a finite set of rectifiable curves do not affect the
existence of the integral.

A double integral, say $\iint g(x,y)\,dx\,dy$, over $S$, the interior of a closed plane curve $C$,
can then be defined by taking a rectangle $D$ enclosing $C$ and defining $f(x, y) = g(x, y)$ in $S$
and $f(x, y) = 0$ outside $S$. Then $\iint g(x, y)\,dx\,dy$ is defined as $\iint f(x, y)\,dx\,dy$ over the
interior of $D$, provided $f(x, y)$ satisfies the condition of integrability over $D$.
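
The definition can be illustrated by forming the sums $\sum\sum f(\xi_r, \eta_s) h_r k_s$ directly. The sketch below makes two simplifying assumptions that are not in the text: the subdivision is into equal rectangles, and the specimen point is the centre of each rectangle. The names `riemann_double` and `g_extended` are hypothetical. The second function shows the extension by zero outside $S$, here a quarter disc of unit radius, whose boundary is a rectifiable curve.

```python
def riemann_double(f, A, B, n):
    """Sum of f(xi_r, eta_s) * h_r * k_s over an n-by-n grid of equal rectangles
    covering 0 <= x <= A, 0 <= y <= B, with the specimen point at each centre."""
    h, k = A / n, B / n
    total = 0.0
    for r in range(n):
        x = (r + 0.5) * h
        for s in range(n):
            y = (s + 0.5) * k
            total += f(x, y) * h * k
    return total

def g_extended(x, y):
    """g = 1 inside the quarter disc x^2 + y^2 <= 1, extended by 0 outside it."""
    return 1.0 if x * x + y * y <= 1.0 else 0.0
```

For example, `riemann_double(g_extended, 1, 1, n)` approaches the area $\tfrac14\pi$ of the quarter disc as `n` increases, the discontinuity along the arc contributing an error that shrinks with the rectangles.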


5·051. Repeated integrals. A double integral, if it exists, is usually evaluated by
integrating with respect to the variables in turn. For given $x$, $\int f(x, y)\,dy$ is a single
integral and is a function of $x$. If this is then integrated with regard to $x$ the result is the
repeated integral $\int\left\{\int f(x, y)\,dy\right\}dx$; this is usually written $\int dx \int f(x, y)\,dy$, but other
conventions are in use. Alternatively, we could integrate first with respect to $x$ and then
with respect to $y$, obtaining the repeated integral $\int dy \int f(x, y)\,dx$.

If $\int_0^A\!\int_0^B f(x,y)\,dx\,dy$ exists, both repeated integrals exist and are equal to the double
integral. This needs to be understood in a rather extended sense. If for some $x$, say $x = a$,
$f(a, y) = 0$ for irrational $y$ and $= 1$ for rational $y$, $\int_0^B f(a, y)\,dy$ does not exist. But the line
of integration can be enclosed in a rectangle of arbitrarily small area, and such irregularity
for only one value of $x$ will not affect the existence of the double integral. However, we saw
in the discussion of the Riemann integral that whether the integral exists or not, so long
as the function is bounded, the upper and lower integrals $H$, $h$ exist. Then we define
$H(x)$, $h(x)$ as the upper and lower integrals of $f(x, y)$ with respect to $y$ for given $x$, and form
the sums

$$\sum_r H(\xi_r)\,h_r, \quad \sum_r h(\xi_r)\,h_r \quad (x_r \le \xi_r \le x_{r+1}).$$

The existence of the double integral 5·05 (1), which we shall denote by $I$, implies that
for any $\epsilon$ a $\delta$ exists such that whenever the subdivision is into rectangles whose diagonals
do not exceed $\delta$,

$$I - \epsilon \le \sum_r\sum_s m_{rs} h_r k_s \le \sum_r\sum_s M_{rs} h_r k_s \le I + \epsilon,$$

where $x_{r+1} - x_r = h_r$, $y_{s+1} - y_s = k_s$, and $M_{rs}$, $m_{rs}$ are the upper and lower bounds of $f(x,y)$
in the rectangle indicated by the suffixes. Take a line of constant $x$ ($x_r \le x \le x_{r+1}$). In any
rectangle the upper and lower bounds of $f(x, y)$ on this line lie in $(m_{rs}, M_{rs})$; hence

$$\sum_s m_{rs} k_s \le h(x) \le H(x) \le \sum_s M_{rs} k_s \tag{3}$$

and if the largest $k_s$ tends to zero

$$I - \epsilon \le \sum_r n_r h_r \le \sum_r N_r h_r \le I + \epsilon \tag{4}$$

where by $n_r$ we mean the lower bound of $h(x)$, by $N_r$ the upper bound of $H(x)$ in each interval
of $x$. Now let the largest $h_r$ tend to zero, and take the upper bound $I_1$ of the first sum, the
lower bound $I_2$ of the second for all such subdivisions. If $I_1$ and $I_2$ are equal we say that the
repeated integral exists. Now

$$I - \epsilon \le I_1 \le I_2 \le I + \epsilon. \tag{5}$$

Since $I_1$, $I_2$ are independent of $\epsilon$, both must be equal to $I$. Hence $h(x)$, $H(x)$ are both
integrable with respect to $x$ and their integrals are equal to the double integral. The
repeated integral can therefore be interpreted as either $\int_0^A h(x)\,dx$ or $\int_0^A H(x)\,dx$. The
argument shows incidentally that if there are points where $H(x) - h(x) \ge \omega > 0$, these
points can be enclosed in a finite set of intervals of $x$ of arbitrarily small total length.

The converse is not true; the repeated integrals may both exist and be equal without the 
double integral existing. But in practice such cases are rare and in any ordinary case 
direct examination of the function will decide quickly whether the condition for the 
existence of the double integral is satisfied. 

A double integral $\iint f(x,y)\,dx\,dy$ over an infinite region $R$ can be defined by taking
a sequence of regions $\{R_n\}$ such that, for any part of $R$, this part is included in all $R_n$ for
$n$ greater than some $m$. If the double integral over $R_n$ has a unique limit for all such
sequences, this limit can be taken as the definition of the integral over $R$. Improper
double integrals may be defined similarly. It appears, however, that unless the same process
gives a unique value when $|f(x,y)|$ is substituted for $f(x,y)$ the value of the limit will
depend on the shapes of the regions $R_n$, and consequently a non-absolutely convergent
double integral has no meaning unless these are specified. For an absolutely convergent
double integral inversions of limiting processes can be justified by the theorem of 1·111.*
Analogous statements are true for triple and $n$-ple integrals. With the definition of the
area of a surface that we shall give, if a surface possesses a finite area it can be enclosed in
a set of parallelepipeds of arbitrarily small total volume (5·07a).

If $f(x, y) = 0$ outside a region other than a rectangle with sides parallel to the axes, the
termini for $y$ in the repeated integral, if $x$ is kept constant, will depend on $x$. Thus let us
suppose that the region is a quadrant of a circle defined by $x \ge 0$, $y \ge 0$, $x^2 + y^2 \le a^2$. For
given $x$, these inequalities require

$$0 \le y \le (a^2 - x^2)^{1/2},$$

and the range of integration for $y$ is therefore 0 to $(a^2 - x^2)^{1/2}$. Then $x$ can range from 0
to $a$, and the integral over the quadrant is

$$\int_0^a dx \int_0^{\sqrt{a^2-x^2}} f(x, y)\,dy.$$

Similarly, it can be written $$\int_0^a dy \int_0^{\sqrt{a^2-y^2}} f(x, y)\,dx.$$

* For further details see Hobson, Theory of functions of a real variable, 1907, Ch. 5; 1920, Vol. 1, Ch. 6.
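
The two repeated integrals over the quadrant can be compared numerically. The following is a minimal sketch, assuming a simple midpoint rule for each single integration; the helper names (`midpoint`, `quadrant_y_first`, `quadrant_x_first`) are hypothetical.

```python
import math

def midpoint(g, lo, hi, n=200):
    """Midpoint-rule estimate of the single integral of g from lo to hi."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

def quadrant_y_first(f, a, n=200):
    """integral_0^a dx integral_0^sqrt(a^2 - x^2) f(x, y) dy."""
    def inner(x):
        return midpoint(lambda y: f(x, y), 0.0, math.sqrt(a * a - x * x), n)
    return midpoint(inner, 0.0, a, n)

def quadrant_x_first(f, a, n=200):
    """integral_0^a dy integral_0^sqrt(a^2 - y^2) f(x, y) dx."""
    def inner(y):
        return midpoint(lambda x: f(x, y), 0.0, math.sqrt(a * a - y * y), n)
    return midpoint(inner, 0.0, a, n)
```

For $f = xy^2$ and $a = 1$ both orders should give values close to $1/15$, the value found by integrating over the quadrant in polar coordinates.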

The region of integration always specifies a set of inequalities to be satisfied by the 
independent variables. From these the limits for a repeated integral can be found either 
by direct transformation, as in the example just considered, or with the aid of a figure. 
The latter is often useful, particularly when the boundary consists of several curves 
with different equations. 

A repeated integral often arises directly, but is sometimes most easily evaluated by 
inverting the order of integration. When the limits are not constant, the easiest way of 
deciding the limits of the inverted integral is usually to find inequalities as for the double 
integral as an intermediate step. Thus consider 

$$I = \int_0^t dx \int_0^x f(y)\,dy.$$

The inequalities indicated, if $x$ and $y$ are to be within the
region of integration, are

$$0 \le y \le x, \quad 0 \le x \le t,$$

and for given $y$, $y \le x \le t$, while $y$, if unrestricted by $x$, can
range from 0 to $t$. Then

$$I = \int_0^t dy \int_y^t f(y)\,dx = \int_0^t (t-y)f(y)\,dy,$$

so that one integration has been performed immediately. Geometrically, the integral is
over the triangle whose corners are $(0,0)$, $(t, 0)$, $(t, t)$; and if we integrate first with regard
to $x$ it must proceed from $y$ to $t$ to cover the variation possible for given $y$ within the
triangle.
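
The inversion can be verified numerically for a particular integrand. The sketch below assumes a midpoint rule and uses hypothetical helper names; with $f(y) = \cos y$ both forms should be close to $1 - \cos t$.

```python
import math

def midpoint(g, lo, hi, n=200):
    """Midpoint-rule estimate of the single integral of g from lo to hi."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

def repeated_form(f, t, n=200):
    """integral_0^t dx integral_0^x f(y) dy, evaluated as a repeated integral."""
    return midpoint(lambda x: midpoint(f, 0.0, x, n), 0.0, t, n)

def inverted_form(f, t, n=200):
    """integral_0^t (t - y) f(y) dy, the result of inverting the order."""
    return midpoint(lambda y: (t - y) * f(y), 0.0, t, n)
```

The inverted form needs only a single integration, which is the practical advantage noted above.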

5·052. Change of variables. Take a double integral

$$I = \iint f(x,y)\,dx\,dy$$

which is ordinarily evaluated by successive integration with respect to $x$ and $y$. Let
$\xi$ and $\eta$ be two differentiable functions of $x$ and $y$. The curves of $\xi$ constant and $\eta$ constant
will in general mark out the plane of $x$, $y$ into four-cornered figures, which could be used
as the elements of area in defining the double integral. We have to express the elements
of area in terms of $\xi$ and $\eta$. $x$ and $y$ will now be regarded as functions of $\xi$, $\eta$.

As we turn to the left in turning from the axis of positive $x$ to that of positive $y$, we
also take $\xi$ and $\eta$ so that we turn to the left in changing from the direction of increasing $\xi$
to that of increasing $\eta$; but these directions will not in general be perpendicular. The
simplest approach is to notice that as $\delta\xi$ and $\delta\eta$ tend to zero the element of area specified
approximates to a parallelogram except possibly near special points, such as the centre
of a circle if $\xi$, $\eta$ are polar coordinates. To the first order in $\delta\xi$, $\delta\eta$, if one corner is $(x, y)$ the
two adjacent ones are

$$\left(x + \frac{\partial x}{\partial\xi}\delta\xi,\; y + \frac{\partial y}{\partial\xi}\delta\xi\right), \quad \left(x + \frac{\partial x}{\partial\eta}\delta\eta,\; y + \frac{\partial y}{\partial\eta}\delta\eta\right).$$

But the area of a parallelogram with three of its vertices at $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ is

$$\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} = \begin{vmatrix} x_2 - x_1 & y_2 - y_1 \\ x_3 - x_1 & y_3 - y_1 \end{vmatrix}$$

and therefore the area of the parallelogram is

$$\begin{vmatrix} \dfrac{\partial x}{\partial\xi} & \dfrac{\partial y}{\partial\xi} \\[2mm] \dfrac{\partial x}{\partial\eta} & \dfrac{\partial y}{\partial\eta} \end{vmatrix}\,\delta\xi\,\delta\eta.$$

The determinant is called the Jacobian of $x$, $y$ with respect to $\xi$, $\eta$ and denoted by
$\partial(x,y)/\partial(\xi,\eta)$. Then

$$\iint f(x,y)\,dx\,dy = \iint f(x,y)\,\frac{\partial(x,y)}{\partial(\xi,\eta)}\,d\xi\,d\eta,$$

where $f(x, y)$ must be expressed as a function of $\xi$ and $\eta$, and the range of integration is
such that each element of area within the original region appears once and only once in
the transformed integral.

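This is the simplest way of getting the answer, but has several disadvantages. It is
difficult to fix limits to the error in replacing the element bounded by curves of constant $\xi$ and $\eta$
by a parallelogram, and therefore to show that the total error tends to zero when all the
ranges $\delta\xi$, $\delta\eta$ do. Also it is difficult to generalize, for though the argument in this form is
easily extended to triple integrals, since we have a convenient form for the volume of a
general parallelepiped, the extension to $n$-ple integrals is not obvious. We get, in fact,

$$\iiint f(x,y,z)\,dx\,dy\,dz = \iiint f(x,y,z)\,\frac{\partial(x,y,z)}{\partial(\xi,\eta,\zeta)}\,d\xi\,d\eta\,d\zeta,$$

where

$$\frac{\partial(x,y,z)}{\partial(\xi,\eta,\zeta)} = \begin{vmatrix} \dfrac{\partial x}{\partial\xi} & \dfrac{\partial y}{\partial\xi} & \dfrac{\partial z}{\partial\xi} \\[2mm] \dfrac{\partial x}{\partial\eta} & \dfrac{\partial y}{\partial\eta} & \dfrac{\partial z}{\partial\eta} \\[2mm] \dfrac{\partial x}{\partial\zeta} & \dfrac{\partial y}{\partial\zeta} & \dfrac{\partial z}{\partial\zeta} \end{vmatrix}.$$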

But four variables of integration occur in general relativity, and in more than three 
dimensions it becomes difficult to see what we mean by any generalization of the volume 
even of a parallelepiped, except by adopting a purely analytic definition by an integral. 
Then the known properties of area and volume in Euclidean geometry can no longer be 
used to short-circuit the direct transformation of variables. The variables x, y, z may not 
be Cartesian coordinates; then it becomes difficult to see what we mean by the area or 
volume of a region. For all these reasons it is best to proceed by direct transformation 
of variables, one at a time. It is convenient to use a suffix notation and to take the case 
of a triple integral to illustrate the method, which can clearly be extended to any order. 
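We start with

$$I = \iiint f(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3, \tag{1}$$

and regard $x_1$, $x_2$, $x_3$ as functions of $\xi_1$, $\xi_2$, $\xi_3$, with continuous first partial derivatives, such
that there is a one-one correspondence between points of the $x_i$, $\xi_i$ regions. Then, by the
mean value theorem,

$$\delta x_i = \frac{\partial x_i}{\partial\xi_j}\,\delta\xi_j, \tag{2}$$

where each $\partial x_i/\partial\xi_j$ is to be regarded as evaluated at some point (not necessarily all at the
same point) inside the element bounded by two sets of values of the $\xi_j$ differing by $\delta\xi_j$.
We first express $x_1$ as a function of $\xi_1$, $x_2$ and $x_3$. Then, keeping $x_2$ and $x_3$ constant, we have
to solve the equations

$$\delta x_1 = \frac{\partial x_1}{\partial\xi_j}\,\delta\xi_j, \quad 0 = \frac{\partial x_2}{\partial\xi_j}\,\delta\xi_j, \quad 0 = \frac{\partial x_3}{\partial\xi_j}\,\delta\xi_j, \tag{3}$$

and the solution is

$$\begin{vmatrix} \dfrac{\partial x_1}{\partial\xi_1} & \dfrac{\partial x_1}{\partial\xi_2} & \dfrac{\partial x_1}{\partial\xi_3} \\[2mm] \dfrac{\partial x_2}{\partial\xi_1} & \dfrac{\partial x_2}{\partial\xi_2} & \dfrac{\partial x_2}{\partial\xi_3} \\[2mm] \dfrac{\partial x_3}{\partial\xi_1} & \dfrac{\partial x_3}{\partial\xi_2} & \dfrac{\partial x_3}{\partial\xi_3} \end{vmatrix}\,\delta\xi_1 = \begin{vmatrix} \dfrac{\partial x_2}{\partial\xi_2} & \dfrac{\partial x_2}{\partial\xi_3} \\[2mm] \dfrac{\partial x_3}{\partial\xi_2} & \dfrac{\partial x_3}{\partial\xi_3} \end{vmatrix}\,\delta x_1. \tag{4}$$

When $\delta\xi_1 \to 0$ the determinants tend to the Jacobians evaluated at $(x_1, x_2, x_3)$; and therefore

$$I = \iiint f\,\frac{\partial(x_1,x_2,x_3)}{\partial(\xi_1,\xi_2,\xi_3)} \bigg/ \frac{\partial(x_2,x_3)}{\partial(\xi_2,\xi_3)}\;d\xi_1\,dx_2\,dx_3. \tag{5}$$

Now express $x_2$ as a function of $\xi_1$, $\xi_2$ and $x_3$; we find similarly, keeping $\xi_1$, $x_3$ constant,

$$\delta x_2 = \frac{\partial x_2}{\partial\xi_2}\,\delta\xi_2 + \frac{\partial x_2}{\partial\xi_3}\,\delta\xi_3, \quad 0 = \frac{\partial x_3}{\partial\xi_2}\,\delta\xi_2 + \frac{\partial x_3}{\partial\xi_3}\,\delta\xi_3, \tag{6}$$

$$\frac{\partial(x_2,x_3)}{\partial(\xi_2,\xi_3)}\,\delta\xi_2 = \frac{\partial x_3}{\partial\xi_3}\,\delta x_2, \tag{7}$$

$$I = \iiint f\,\frac{\partial(x_1,x_2,x_3)}{\partial(\xi_1,\xi_2,\xi_3)} \bigg/ \frac{\partial x_3}{\partial\xi_3}\;d\xi_1\,d\xi_2\,dx_3. \tag{8}$$

Finally, express $x_3$ as a function of $\xi_1$, $\xi_2$, $\xi_3$, and then keeping $\xi_1$ and $\xi_2$ constant

$$\delta x_3 = \frac{\partial x_3}{\partial\xi_3}\,\delta\xi_3, \tag{9}$$

and $$I = \iiint f\,\frac{\partial(x_1,x_2,x_3)}{\partial(\xi_1,\xi_2,\xi_3)}\,d\xi_1\,d\xi_2\,d\xi_3. \tag{10}$$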

The successive changes of variables are justified by 1·1032 provided the integrals exist,
and the result (10) can be identified as the triple integral if the latter exists. In the
conditions stated, $J$ is continuous and bounded in any finite region, so that the triple
integral necessarily exists.

There is, however, a difficulty if $J$ changes sign within the region of integration. Even
for change of a single variable, say $x$ to $\xi$, if $dx/d\xi$ changes sign within the range of
integration, two or more values of $\xi$ will correspond to a given $x$, and the integral must be written
in the Stieltjes manner if the correct answer is to be reached. In the transformation of
repeated integrals, if the Jacobian vanishes at a point within the region, there is a
neighbourhood of the point where a given set of values of $x_1$, $x_2$, $x_3$ will correspond to more
than one set $\xi_1$, $\xi_2$, $\xi_3$. If this occurs, the best procedure is usually to break up the region
into subregions in each of which the Jacobian keeps a constant sign and to consider each
separately.

This method shows that the first method, based on treating elements of area or volume
as parallelograms or parallelepipeds, actually gives the right answer, and being justified
to this extent can be used directly in many cases where it is more convenient than working
out the Jacobian explicitly. The transformation is often simplified by use of the theorem

$$\frac{\partial(x_1, \ldots, x_n)}{\partial(\eta_1, \ldots, \eta_n)} = \frac{\partial(x_1, \ldots, x_n)}{\partial(\xi_1, \ldots, \xi_n)}\,\frac{\partial(\xi_1, \ldots, \xi_n)}{\partial(\eta_1, \ldots, \eta_n)}, \tag{11}$$

for

$$\frac{\partial x_i}{\partial\eta_k} = \frac{\partial x_i}{\partial\xi_j}\,\frac{\partial\xi_j}{\partial\eta_k}, \tag{12}$$

where $x_i$ on the left is regarded as a function of the $\eta_k$ and on the right as a function of the
$\xi_j$. But the determinant of the expressions on the right of (12) is the product of the two
determinants on the right of (11), by the rule for multiplying determinants. Alternatively,
the theorem is required by consistency; for if it was false we could get a different result
by transforming an integral to variables $\eta_k$ directly from what we should get by
transforming first to $\xi_j$ and then to $\eta_k$.
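
The product rule for Jacobians can be checked numerically in two variables. The sketch below is an illustration under stated assumptions: the Jacobians are estimated by central differences, the maps chosen (polar coordinates for $x(\xi)$ and $r = e^u$, $\theta = v$ for $\xi(\eta)$) are examples of our own, and all function names are hypothetical.

```python
import math

def jacobian_det(F, p, h=1e-6):
    """Central-difference estimate of d(F1, F2)/d(p1, p2) at the point p."""
    a, b = p
    fa_plus, fa_minus = F(a + h, b), F(a - h, b)
    fb_plus, fb_minus = F(a, b + h), F(a, b - h)
    d11 = (fa_plus[0] - fa_minus[0]) / (2 * h)   # dF1/dp1
    d21 = (fa_plus[1] - fa_minus[1]) / (2 * h)   # dF2/dp1
    d12 = (fb_plus[0] - fb_minus[0]) / (2 * h)   # dF1/dp2
    d22 = (fb_plus[1] - fb_minus[1]) / (2 * h)   # dF2/dp2
    return d11 * d22 - d12 * d21

def x_of_xi(r, theta):
    """x as a function of xi = (r, theta): plane polar coordinates."""
    return (r * math.cos(theta), r * math.sin(theta))

def xi_of_eta(u, v):
    """xi as a function of eta = (u, v): r = exp(u), theta = v."""
    return (math.exp(u), v)

def x_of_eta(u, v):
    """x regarded directly as a function of eta."""
    return x_of_xi(*xi_of_eta(u, v))
```

At a point $\eta = (u, v)$ the direct Jacobian `jacobian_det(x_of_eta, eta)` should agree with the product of the two intermediate Jacobians, both being $e^{2u}$ for these particular maps.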

There is a theorem that if the Jacobian of $\xi_1, \ldots, \xi_n$ with respect to $x_1, \ldots, x_n$ is zero
everywhere, one of the $\xi$'s is determined when the others are given, and they are not a
suitable system of coordinates. A proof will be found in most books on calculus.


5·053. Changes of limits. Change of variables naturally implies changes of the limits.
The new limits may be found either analytically, by writing the ranges in terms of
inequalities, or by drawing a figure and finding the limits by examining the ranges of
the new variables required to cover the latter. No general rule of transformation can be
given, but the methods will be illustrated by examples.

There is an apparent inconsistency between the Jacobian transformation and the
simple reversal of order in an integral with regard to two variables, for
$\dfrac{\partial(x_1, x_2)}{\partial(x_2, x_1)} = -1$.

This is explained as follows: $\dfrac{\partial(x,y,z)}{\partial(\xi,\eta,\zeta)}$ is unaltered by any cyclic interchange of $x$, $y$, $z$
or of $\xi$, $\eta$, $\zeta$. It is positive if, when the directions of $d\xi$, $d\eta$, $d\zeta$ are turned so that $d\xi$ and $d\eta$
lie in the plane of $dx$ and $dy$, the rotation about $dz$ from $d\xi$ to $d\eta$ being positive, $d\zeta$ makes
an acute angle with positive $dz$. Thus the positive Jacobian means that $d\xi$, $d\eta$, $d\zeta$ form a
right-handed set of directions, and if $x$, $y$, $z$ increase throughout their ranges of integration,
$\xi$, $\eta$, $\zeta$ will also increase; for each integration the lower limit is the smaller. If the Jacobian
is negative $d\xi$, $d\eta$, $d\zeta$ form a left-handed set, but an odd number of them will decrease
through the range. If we still make all the lower limits the smaller we must therefore also
reverse the sign. Hence if the lower limit is made the smaller for every variable, the Jacobian
must be replaced by its modulus. The formula for two variables is right because the limits
are not reversed and therefore we must not use $\partial(x, y)/\partial(y, x)$ but its modulus, which is $+1$.


5·054. Polar coordinates. Take

$$I = \iint f(x, y)\,dx\,dy \quad (0 \le x^2 + y^2 \le a^2),$$

and transform to polar coordinates $r$, $\theta$. Then

$$x = r\cos\theta, \quad y = r\sin\theta,$$

$$\frac{\partial(x,y)}{\partial(r,\theta)} = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r,$$

and $$I = \iint f(x, y)\,r\,dr\,d\theta.$$

With the usual conventions we always take $r \ge 0$; then we represent every point within
the circle once by taking $\theta$ from 0 to $2\pi$, and the limits are 0 to $a$ for $r$, 0 to $2\pi$ for $\theta$. It
would be possible to allow negative values of $r$, so that $(-x, -y)$ would correspond to
$(-r, \theta)$ with the same $\theta$ as for $(x, y)$; but then to represent every point once $\theta$ can only
range from 0 to $\pi$, and the limits become $-a$ to $a$ for $r$, 0 to $\pi$ for $\theta$. Either system is equally
valid. The origin is a singular point of the transformation, but gives no trouble.
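
Both systems of limits can be tried numerically. The sketch below is illustrative only and uses a hypothetical function `polar_integral`; it weights each element by $|r|$, following the rule of 5·053 that the Jacobian is replaced by its modulus when every lower limit is made the smaller, which matters for the convention that admits negative $r$.

```python
import math

def polar_integral(f, r_lo, r_hi, t_lo, t_hi, n=200):
    """Midpoint sum of f(x, y) |r| dr dtheta over the given ranges of r and theta."""
    hr = (r_hi - r_lo) / n
    ht = (t_hi - t_lo) / n
    total = 0.0
    for i in range(n):
        r = r_lo + (i + 0.5) * hr
        for j in range(n):
            t = t_lo + (j + 0.5) * ht
            total += f(r * math.cos(t), r * math.sin(t)) * abs(r) * hr * ht
    return total
```

For the unit circle and $f = x^2 + y$, the calls `polar_integral(f, 0, 1, 0, 2 * math.pi)` and `polar_integral(f, -1, 1, 0, math.pi)` should both be close to $\tfrac14\pi$.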

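5·055. Evidently if we change from rectangular coordinates $x$, $y$, $z$ to cylindrical
coordinates $\varpi$, $\lambda$, $z$ we get similarly

$$\frac{\partial(x,y,z)}{\partial(\varpi,\lambda,z)} = \varpi.$$

If we now change from cylindrical to spherical polar coordinates $(r, \lambda, \theta)$ we must put

$$z = r\cos\theta, \quad \varpi = r\sin\theta,$$

$$\frac{\partial(\varpi,\lambda,z)}{\partial(r,\lambda,\theta)} = \frac{\partial(\varpi,z)}{\partial(r,\theta)} = -r,$$

and therefore

$$\frac{\partial(x,y,z)}{\partial(r,\lambda,\theta)} = -r\varpi = -r^2\sin\theta, \quad \frac{\partial(x,y,z)}{\partial(r,\theta,\lambda)} = r^2\sin\theta.$$

This can also be obtained directly by taking

$$x = r\sin\theta\cos\lambda, \quad y = r\sin\theta\sin\lambda, \quad z = r\cos\theta.$$

We represent the whole of the interior of a sphere of radius $a$ by letting $r$ range from 0
to $a$, $\theta$ from 0 to $\pi$, and $\lambda$ from 0 to $2\pi$. But either of the following would be equivalent:

$$-a \le r \le a, \quad 0 \le \theta \le \pi, \quad 0 \le \lambda \le \pi,$$

$$0 \le r \le a, \quad -\pi \le \theta \le \pi, \quad 0 \le \lambda \le \pi.$$

We must not take the ranges as, for instance, $0 \le r \le a$, $-\pi \le \theta \le \pi$, $0 \le \lambda \le 2\pi$, for this
would cover the sphere twice over and give twice the correct result.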
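
The two signs of the spherical Jacobian can be confirmed by a finite-difference calculation. This is a sketch under the assumption that central differences are accurate enough at the sample point; the function names are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3-by-3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def jacobian3(F, p, h=1e-6):
    """Central-difference estimate of d(F1, F2, F3)/d(p1, p2, p3) at p."""
    rows = []
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = F(*plus), F(*minus)
        # rows[j][i] = dF_i/dp_j; transposition leaves the determinant unchanged
        rows.append([(fp[i] - fm[i]) / (2 * h) for i in range(3)])
    return det3(rows)

def cartesian_r_theta_lam(r, theta, lam):
    """(x, y, z) with the new variables taken in the order r, theta, lambda."""
    return (r * math.sin(theta) * math.cos(lam),
            r * math.sin(theta) * math.sin(lam),
            r * math.cos(theta))

def cartesian_r_lam_theta(r, lam, theta):
    """The same transformation with the order r, lambda, theta."""
    return cartesian_r_theta_lam(r, theta, lam)
```

With the order $(r, \theta, \lambda)$ the estimate should be close to $r^2\sin\theta$, and with the order $(r, \lambda, \theta)$ close to $-r^2\sin\theta$.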

5·056. As an illustration of several features in the treatment of double integrals let
us take the method given in many text-books for evaluating the integral

$$I = \int_0^\infty e^{-x^2}\,dx.$$

It proceeds as follows:

$$I^2 = \int_0^\infty e^{-x^2}\,dx \times \int_0^\infty e^{-y^2}\,dy \tag{1}$$

$$= \int_0^\infty\!\!\int_0^\infty e^{-(x^2+y^2)}\,dx\,dy \tag{2}$$

$$= \int_0^\infty\!\!\int_0^{\frac12\pi} e^{-r^2}\,r\,dr\,d\theta \tag{3}$$

$$= \int_0^\infty e^{-r^2}\,r\,dr \times \int_0^{\frac12\pi} d\theta \tag{4}$$

$$= \tfrac12 \cdot \tfrac12\pi; \tag{5}$$

therefore $$I = \tfrac12\sqrt\pi. \tag{6}$$

The passage from (1) to (2) requires justification; it is not obvious that the product of
two single integrals can be converted into a double integral. Actually $I$ must be
understood to mean

$$\lim_{X\to\infty}\int_0^X e^{-x^2}\,dx = \lim_{Y\to\infty}\int_0^Y e^{-y^2}\,dy, \tag{7}$$

and we can without loss of generality take $X = Y$. Then (1) reads

$$I^2 = \lim_{X\to\infty}\int_0^X e^{-x^2}\,dx \times \int_0^X e^{-y^2}\,dy. \tag{8}$$

But in

$$\int_0^X\!\!\int_0^X e^{-(x^2+y^2)}\,dx\,dy \tag{9}$$

the limits are finite and the integrand continuous. Hence it can be evaluated as a repeated
integral and is the same as the product of integrals in (8). Hence

$$I^2 = \lim_{X\to\infty}\int_0^X\!\!\int_0^X e^{-(x^2+y^2)}\,dx\,dy. \tag{10}$$

But this is not obviously transformed to (3) by change
of variables; for the region of integration is a square,
and (3) must be understood to mean

$$\lim_{R\to\infty}\int_0^R\!\!\int_0^{\frac12\pi} e^{-r^2}\,r\,dr\,d\theta, \tag{11}$$

and in this the region of integration is a quadrant.
The justification is that the integrand is positive for
all $x$, $y$, and the integral over the square must lie
between those over quadrants of radii $X$ and $X\sqrt2$.
That is,

$$\int_0^X\!\!\int_0^{\frac12\pi} e^{-r^2}\,r\,dr\,d\theta < \int_0^X\!\!\int_0^X e^{-(x^2+y^2)}\,dx\,dy < \int_0^{X\sqrt2}\!\!\int_0^{\frac12\pi} e^{-r^2}\,r\,dr\,d\theta. \tag{12}$$

When we integrate with regard to $\theta$ and then proceed to the limit, the first and third
expressions tend to the same limit; hence

$$I^2 = \tfrac12\pi\int_0^\infty e^{-r^2}\,r\,dr = \tfrac14\pi. \tag{13}$$

By an obvious transformation we find the result, often wanted later,*

$$\int_0^\infty x^{-1/2}e^{-x}\,dx = \sqrt\pi. \tag{14}$$
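
The bracketing of the integral over the square between the integrals over the two quadrants can be observed numerically. The following is a minimal sketch: the single integral is estimated by a midpoint rule, the quadrant integral is written in closed form as $\tfrac14\pi(1 - e^{-R^2})$, and the function names are hypothetical.

```python
import math

def midpoint(g, lo, hi, n=2000):
    """Midpoint-rule estimate of the single integral of g from lo to hi."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

def square_integral(X):
    """Integral of exp(-(x^2 + y^2)) over the square 0..X by 0..X,
    evaluated as the product of two equal single integrals."""
    one = midpoint(lambda x: math.exp(-x * x), 0.0, X)
    return one * one

def quadrant_integral(R):
    """Closed form of integral_0^R integral_0^{pi/2} exp(-r^2) r dr dtheta."""
    return 0.25 * math.pi * (1.0 - math.exp(-R * R))
```

For each $X$ the value `square_integral(X)` should lie between `quadrant_integral(X)` and `quadrant_integral(X * math.sqrt(2))`, and all three approach $\tfrac14\pi$ as $X$ increases.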


5·057. A multiple integral. Especially in the theory of probability the integral

$$I = \int\!\!\int\cdots\int \exp(-\tfrac12 W)\,dx_1 \ldots dx_n \tag{1}$$

is often wanted, where $$W = a_{ik}x_ix_k \tag{2}$$

is a positive definite form and the limits are $-\infty$ to $\infty$ for all variables. If $W$ is not positive definite
the integral does not converge. We know that $W$ can be reduced to a sum of squares by a
transformation to variables $\xi_1 \ldots \xi_n$, such that the Jacobian of the transformation is unity (cf. 4·08). Then if

$$W = A_{11}\xi_1^2 + \ldots + A_{nn}\xi_n^2, \tag{3}$$

$I$ is the product of $n$ integrals of the form

$$\int_{-\infty}^{\infty} \exp(-\tfrac12 A_{rr}\xi_r^2)\,d\xi_r = \sqrt{\frac{2\pi}{A_{rr}}}. \tag{4}$$

But $$A_{11}A_{22}\ldots A_{nn} = \|a_{ik}\|. \tag{5}$$

Hence $$I = \frac{(2\pi)^{\frac12 n}}{\sqrt{\|a_{ik}\|}}. \tag{6}$$

* See Chapter 15 for the definition of the factorial function.

5·06. Integrals along rectifiable curves. If

$$I = \int_A^B f(x,y,z)\,ds, \tag{1}$$

$s$ increases continuously as we pass along the curve, and $I$ is a Riemann integral. Note that
if $|f| < M$, and the length of the curve from $A$ to $B$ is $L$, $|I| < ML$.

If $u$, $v$, $w$ are three functions of $x$, $y$, $z$, bounded and continuous on the curve, let

$$J = \int (u\,dx + v\,dy + w\,dz). \tag{2}$$

This is regarded as the sum of three Stieltjes integrals, since $x$, $y$, $z$ are not necessarily
monotonic functions of $s$. $x$ is both continuous as a function of $s$ and of bounded variation.
But even if $u$ is continuous as a function of $s$ it may not be continuous as a function of $x$,
for part of the curve may be perpendicular to the axis of $x$, say for $x = a$, and then $u$ may
change discontinuously as $x$ passes through $a$ because $y$ or $z$ does so. But it follows from
(sufficient) conditions for the existence of the Stieltjes integral that $J$ will exist if the curve
can be divided into a finite number of arcs, on each of which $u$, $v$, $w$ are either continuous
or of bounded variation.

It can be shown* that a rectifiable curve has a tangent almost everywhere; hence the
derivatives

$$l = dx/ds, \quad m = dy/ds, \quad n = dz/ds, \tag{3}$$

exist almost everywhere. By a change of variable we can write

$$J = \int (lu + mv + nw)\,ds = \int l_iu_i\,ds = \int \mathbf{l}\cdot\mathbf{u}\,ds. \tag{4}$$

Sufficient conditions for this change of variable are that both (2) and (4) shall exist and
that the $dx_i/ds$ are bounded, as they obviously are. In practice $u_i$ and $l_i$ are usually
continuous except possibly at a finite number of points, and the integrals will be seen to
exist on inspection.
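
The equality of the two forms (2) and (4) can be illustrated on a simple plane curve. The sketch below is our own example, not one from the text: the curve is a quarter of the unit circle with $x = \cos s$, $y = \sin s$, so that $s$ is the arc length, and the functions are $u = -y$, $v = x$; all names are hypothetical. Both sums should be close to $\tfrac12\pi$.

```python
import math

QUARTER = math.pi / 2   # length of the quarter circle

def curve(s):
    """Point of the unit circle at arc length s from (1, 0)."""
    return (math.cos(s), math.sin(s))

def field(x, y):
    """The functions u, v of the example: u = -y, v = x."""
    return (-y, x)

def stieltjes_sum(n=2000, length=QUARTER):
    """Sum of u dx + v dy over n arcs, with u, v taken at the middle of each arc."""
    total = 0.0
    for i in range(n):
        s0, s1 = i * length / n, (i + 1) * length / n
        x0, y0 = curve(s0)
        x1, y1 = curve(s1)
        u, v = field(*curve(0.5 * (s0 + s1)))
        total += u * (x1 - x0) + v * (y1 - y0)
    return total

def arc_length_form(n=2000, length=QUARTER):
    """Midpoint sum of (l u + m v) ds with l = dx/ds, m = dy/ds."""
    h = length / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        l, m = -math.sin(s), math.cos(s)
        u, v = field(*curve(s))
        total += (l * u + m * v) * h
    return total
```

Here $lu + mv = 1$ everywhere on the curve, so that (4) reduces to the length of the arc.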


5·07. Surface integrals: area of surfaces. Corresponding to simple closed curves
and arcs are closed surfaces and surfaces with boundaries. A closed surface is one of finite
diameter and with an inside and an outside, so that we can pass from any point of the
surface to any other without leaving the surface, from any internal point to any other
without crossing the surface, and similarly from one external point to another; but we
cannot pass from an internal point to any external point without crossing the surface.
If in addition any simple closed curve on the surface can be shrunk up to a point without
leaving the surface, the surface is called simply connected. These conditions are satisfied
by most ordinary surfaces, in particular by the boundaries of solids, but it is possible
to construct an anomalous surface known as the Klein bottle, which has no inside and
outside, and no bounding curve. It is of finite extent and we can pass from any point of
it to any other without leaving it, but any two points not on the surface can be connected
by a path that does not cross it. Such surfaces are excluded from our discussion. Any
straight line that intersects a closed surface at all will intersect it in an even number of
points; we shall not consider surfaces such as, in cylindrical coordinates,

$$\varpi = \sin^2\frac{1}{z},$$

which is intersected an infinite number of times by any line $\lambda = 0$, $\varpi = a$, where $0 < a < 1$.
An example of a surface that is not simply connected is an anchor ring.

* Most simply by A. S. Besicovitch, J. Lond. Math. Soc. 19, 1944, 205-7.

Any simple closed curve on a closed simply connected surface divides it into two portions, 
which we shall call caps. The curve will be called the rim of either cap. Either cap separately 
has no inside, but usually has the further property that a simple closed curve on it and 
not meeting the boundary divides it into two regions separated by the curve, and any such 
interior curve can be contracted to a point without leaving the surface. That is, the cap 
must not have a hole in it. We exclude the Möbius strip because it is possible to draw
closed curves on it that do not divide the strip into two mutually inaccessible regions.
A Möbius strip can be made by taking a long rectangle of paper, giving it a half twist,
and pasting the ends together. It clearly has a single boundary, and a complete longitudinal
cut along it, following the original middle line, does not separate it into two pieces because
the strip is still held together at the edge. Again, we can make a longitudinal cut one-third
or less of the way across; this will be found to divide the surface into two pieces, the edge
portion giving a strip with two edges and a complete twist in it, while the inner portion
remains as a narrower Möbius strip interlocked with the edge portion. We thus see that
the original edge of the strip can be deformed into a circle by continuous distortion.
Further, had the strip been made of a more extensible material, the inner portion, instead
of being severed, might have been stretched to maintain its connexion with the outer;
hence a circle can be filled by a one-sided surface. Therefore any closed curve capable of
being continuously deformed into a circle is also capable of being filled by a one-sided
surface. But it is obviously also capable of being filled by a two-sided surface. Imagining the
deformation now reversed so that the circle is deformed into the edge of a Möbius strip,
we see that the edge of the latter can also be filled by a two-sided surface.* In what follows
we shall be entirely concerned with two-sided surfaces.

Such considerations about properties of figures in space that are maintained in any 
continuous deformation of the figures belong to the branch of mathematics known as 
topology. Such statements as that a plane closed curve has an inside and an outside seem 
trivial at first sight, but are actually true only when the definition of a closed curve is 
made perfectly clear and even then are quite difficult to prove. Again, in the theory of the 
magnetic field of an electric current it is usually taken for granted that the closed circuit 
can be filled in by a two-sided surface; but the edge of a Möbius strip would be quite a 
possible form for such a circuit, and how to fill it up with a two-sided surface is far from 
obvious, though it can be done. 

* For illustrations of a one-sided surface filling a circle, and a two-sided surface filling the edge 
of a Möbius strip, see Courant and Robbins, What is Mathematics? pp. 261, 388. The one-sided 
surface is not simply a mathematical curiosity; one will actually be formed by a soap film stretched 
across a boundary of suitable form. Whether the film will be stable as a two-sided or a one-sided 
surface is simply a matter of which has the smaller area. 

Corresponding to the length of a curve it is natural to try to define the area of a curved 
surface by taking a set of points in the surface, connecting them up so as to form triangles, 
and defining the area of the surface as the limit of the sum of the areas of the triangles 
when these are made indefinitely small. (With plane polygons of more than three vertices, 
there is, of course, an extra complication; four points on a curved surface do not necessarily 
lie in one plane.) Unfortunately, without some further restrictions on how we are to 
select the points and what pairs are to be joined, this does not lead to a unique definition 
of the area, except when the surface is a plane. From the study of the definition of a 
multiple integral it is natural to require that all sides of the triangles should tend to zero 
and not simply that the areas should; but even this is not sufficient. To take an example 
given by H. A. Schwarz,* imagine a circular cylinder of radius 1 with its axis parallel to 
the z axis, divided up by plane cross-sections at interval m. Take on each section n points 
equally spaced about it; for each section the n points are opposite the points midway 
between those chosen in the adjacent sections. Then these points specify a set of isosceles 
triangles with their vertices in the surface, and their sides can be made arbitrarily small 
by taking m small enough and n large enough. But if A, B are points of a section with their 
cylindrical coordinates $\lambda$ differing by $2\pi/n$, the midpoint of A, B is inside the cylinder by 
a distance $1 - \cos(\pi/n)$, and the z coordinate differs from that of C, the nearest point of the 
next section, by m. Hence the plane of the triangle is inclined to the tangent plane at C 
at an angle 

$$\tan^{-1}\left(\frac{2}{m}\sin^2\frac{\pi}{2n}\right).$$

This is small only if $mn^2$ is large; if m tends to zero like $n^{-1}$ the planes of the triangles 
approach the tangent planes, but not if m tends to zero like $n^{-2}$. Again, the area of a 
triangle is 

$$\sin\frac{\pi}{n}\left(m^2 + 4\sin^4\frac{\pi}{2n}\right)^{1/2},$$

and that of the $2n/m$ triangles covering length 1 of the cylinder is 

$$2n\sin\frac{\pi}{n}\left(1 + \frac{4}{m^2}\sin^4\frac{\pi}{2n}\right)^{1/2}.$$

If $n \to \infty$, this tends to $2\pi$ provided that $mn^2 \to \infty$. Thus the condition that the sum of the 
areas of the triangles shall approach that of the cylinder is the same as the condition that 
their planes shall tend to the tangent planes. If this condition is not satisfied the sum 
may tend to any limit greater than $2\pi$. 
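
The behaviour of Schwarz's sum can be checked numerically. The sketch below simply evaluates the expression for the total area of the triangles per unit length of the cylinder; the function name `lantern_area` and the particular choices of m as a power of 1/n are ours, made only for illustration.

```python
import math

def lantern_area(n: int, m: float) -> float:
    """Sum of the triangle areas per unit length of a cylinder of radius 1.

    n is the number of points on each cross-section and m the interval
    between sections, as in Schwarz's example.
    """
    s = math.sin(math.pi / (2 * n))
    return 2 * n * math.sin(math.pi / n) * math.sqrt(1 + 4 * s**4 / m**2)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        # m like 1/n, 1/n^2 and 1/n^3 respectively
        print(n, lantern_area(n, 1 / n), lantern_area(n, 1 / n**2), lantern_area(n, 1 / n**3))
```

With m = 1/n the values settle towards 2π; with m = 1/n² they settle towards a larger finite limit; with m = 1/n³ they increase without bound.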

* Ges. math. Abhand. 2, 1890, 309-11. 

Consequently the definition of the area of a curved surface is more difficult than that 
of the length of a curve. For a curve, so long as the direction cosines of the tangent vary 
continuously with position, the direction of a short chord must tend to that of the tangent, 
and the ratio of its length to that of the arc must tend to 1. For a surface, the corresponding 
approximations for triangles require a further condition, which can be taken to 
be that all angles of the triangles are greater than some fixed angle $\delta$. But this introduces 
the further question: for what sort of surfaces is such a choice possible however short the 
sides? It has been shown possible* provided that the surface is bounded in all directions, 
and has a normal at every point, such that if $Q_1, Q_2, \ldots$ are points on the surface tending 
to P, the normal at $Q_n$ tends to that at P. The last condition is often described by saying 
that the surface has a continuously turning normal. These conditions are somewhat 
unnecessarily strict, being obviously not satisfied by a cube. But since the usual formula 
for the area of a surface is an integral of a function of direction cosines the mere existence 
of the integral implies that they cannot be greatly relaxed if this formula is to be correct, 
and they may as well be adopted. 

If the surface is given in the form 

z = F(x,y), (1) 

where F has continuous partial derivatives $F_x$, $F_y$ with regard to x and y, we can cover the 
projection of the surface on the x, y plane by rectangles of sides h, k so that the leaps of 
$F_x$ and $F_y$ in each rectangle are less than $\epsilon$. Take the points where lines through the corners 
of these rectangles parallel to the z axis meet the surface. In each quadrilateral so determined 
on the surface join one diagonal. We thus find a set of triangles with all their vertices 
in the surface; the projection of each on the x, y plane is $\tfrac12 hk$. If now the vertices on the 
(x, y) plane are $(x_r, y_r)$, $(x_r + h, y_r)$, $(x_r, y_r + k)$, and we label them with suffixes 1, 2, 3, the 
projection on the plane of x constant is 

$$\tfrac12 (y_3 - y_1)(z_2 - z_1) = \tfrac12 kh\,F_x(x_r + \theta h,\, y_r), \tag{2}$$

where $0 < \theta < 1$. Similarly for $(x_r + h, y_r)$, $(x_r, y_r + k)$, $(x_r + h, y_r + k)$ the projection is 

$$\tfrac12 kh\,F_x(x_r + \theta' h,\, y_r + k).$$

Then the sum of the areas of the triangles is 

$$\sum \tfrac12 hk\,\{1 + (F_x)^2 + (F_y)^2\}^{1/2}, \tag{3}$$

where $F_x$, $F_y$ are to be evaluated in each case for two points of the rectangle in the (x, y) 
plane. If now we make $\epsilon$ tend to zero this sum tends to the integral 

$$\iint (1 + F_x^2 + F_y^2)^{1/2}\,dx\,dy, \tag{4}$$

which exists in the conditions stated. But it must be remembered that this result depends 
on a particular way of choosing the triangles, though it is a very natural way. If we want 
the integral of a function f(x, y, z) over the surface we can take for each triangle the value 
of f(x, y, F(x, y)) for a point of the triangle in the (x, y) plane, and if f is continuous the 
result will be $\iint f(x, y, z)\,(1 + F_x^2 + F_y^2)^{1/2}\,dx\,dy$, which may be briefly written $\iint f\,dS$. 
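
As an illustration, the triangulation just described can be carried out numerically and compared with a direct evaluation of the integral of $(1 + F_x^2 + F_y^2)^{1/2}$. The following is only a sketch: the region is taken to be the unit square, the surface $z = x^2 + \tfrac12 y^2$ is an arbitrary choice of ours, and the function names are hypothetical.

```python
import math

def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r in three dimensions."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def triangulated_area(F, n):
    """Sum of triangle areas for z = F(x, y) over the unit square.

    The square is divided into n x n equal rectangles and one diagonal
    is joined in each quadrilateral on the surface.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x0, y0 = i * h, j * h
            a = (x0, y0, F(x0, y0))
            b = (x0 + h, y0, F(x0 + h, y0))
            c = (x0, y0 + h, F(x0, y0 + h))
            d = (x0 + h, y0 + h, F(x0 + h, y0 + h))
            total += triangle_area(a, b, c) + triangle_area(b, c, d)
    return total

def integral_area(Fx, Fy, n):
    """Midpoint-rule value of the area integral over the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * h, (j + 0.5) * h
            total += math.sqrt(1 + Fx(x, y) ** 2 + Fy(x, y) ** 2) * h * h
    return total

# Illustrative surface (our choice): z = x^2 + y^2 / 2.
F = lambda x, y: x * x + 0.5 * y * y
Fx = lambda x, y: 2 * x
Fy = lambda x, y: y
```

For a plane such as z = 2x + 3y the triangles lie in the surface, and the sum reproduces the exact area $\sqrt{14}$ for every n.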

The method assumes z single-valued. If a line parallel to the z axis meets the surface 
in two points it will be necessary to treat the two separately; thus the surface of a sphere 
must be regarded as two surfaces, one corresponding to the set of the smaller values for 
given x , y, the other to the larger values. Also it assumes F x , F y continuous and therefore 
bounded, since the integration is over a closed region of x, y; but for a sphere they tend 
to infinity and the integrals must be interpreted as improper ones by first excluding a zone 
about the boundary of the region and then making the width of the zone tend to zero. 
If $F_x$, $F_y$ are discontinuous along a rectifiable curve or at a finite set of isolated points, the 
surface can be subdivided so as to make the discontinuities part of the boundaries of the 
pieces, and there is no special difficulty. If part of the surface consists of lines parallel 
to the axis of z, we take it in the form x = G(y, z) or y = H(z, x). 

* O. D. Kellogg, Foundations of Potential Theory, 1929, 100-112. 

The integrals are usually improper, and to avoid difficulties connected with infinite 
series of improper integrals it is usual to require that the surface shall not be intersected 
an infinite number of times by any line parallel to an axis. The restrictions on the types of 
surface considered are therefore rather numerous, but fortunately they are satisfied by 
most of the surfaces we meet in practice. 

The surface may be given parametrically by $x = x(\xi, \eta)$ and similar equations, where 
x, y, z have continuous derivatives with regard to $\xi$ and $\eta$. If we take $\lambda$, $\mu$ so that the leaps 
of these derivatives are always less than $\epsilon$ when $|\xi_1 - \xi_2| < \lambda$, $|\eta_1 - \eta_2| < \mu$, we can apply 
a similar argument to triangles specified by rectangles in the $\xi$, $\eta$ plane of sides $\lambda$, $\mu$. It is 
found that the sum now tends to 

$$\iint J\,d\xi\,d\eta, \tag{5}$$

where 

$$J^2 = \left\{\frac{\partial(y,z)}{\partial(\xi,\eta)}\right\}^2 + \left\{\frac{\partial(z,x)}{\partial(\xi,\eta)}\right\}^2 + \left\{\frac{\partial(x,y)}{\partial(\xi,\eta)}\right\}^2, \tag{6}$$

and that the direction cosines of the normal are given by 

$$(l, m, n) = \pm\frac{1}{J}\left(\frac{\partial(y,z)}{\partial(\xi,\eta)},\ \frac{\partial(z,x)}{\partial(\xi,\eta)},\ \frac{\partial(x,y)}{\partial(\xi,\eta)}\right), \tag{7}$$

the sign being positive if the rotation from $d\xi$ to $d\eta$ is positive about the normal. By 
direct transformation of variables it can be shown that this integral is equal to that 
given by (4) and reduces to it when $\xi = x$, $\eta = y$, so that the rule for the area is invariant 
for different parametric representations of the surface. This is specially important 
because the method of triangulation adopted is specially chosen for each representation 
and it is therefore necessary to verify consistency. 
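
A quick numerical check of the parametric rule can be made for the unit sphere with colatitude θ and longitude φ as parameters. In this sketch J is taken as the square root of the sum of the squares of the three Jacobians ∂(y,z)/∂(ξ,η), ∂(z,x)/∂(ξ,η), ∂(x,y)/∂(ξ,η), which is our reading of the formula; the function names are illustrative only.

```python
import math

def sphere_derivatives(theta, phi):
    """Derivatives of (x, y, z) on the unit sphere with regard to theta and phi."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    d_theta = (ct * cp, ct * sp, -st)
    d_phi = (-st * sp, st * cp, 0.0)
    return d_theta, d_phi

def jacobian_J(d_xi, d_eta):
    """Square root of the sum of squares of the three two-rowed Jacobians."""
    jyz = d_xi[1] * d_eta[2] - d_xi[2] * d_eta[1]
    jzx = d_xi[2] * d_eta[0] - d_xi[0] * d_eta[2]
    jxy = d_xi[0] * d_eta[1] - d_xi[1] * d_eta[0]
    return math.sqrt(jyz * jyz + jzx * jzx + jxy * jxy)

def sphere_area(n=200):
    """Midpoint-rule value of the double integral of J over 0 < theta < pi, 0 < phi < 2 pi."""
    h_theta, h_phi = math.pi / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h_theta
        for j in range(n):
            phi = (j + 0.5) * h_phi
            total += jacobian_J(*sphere_derivatives(theta, phi)) * h_theta * h_phi
    return total
```

Here J reduces to sin θ, and the computed area agrees with 4π to within the error of the quadrature.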

The difficulties of a definition of area for a general surface are considerable. Lebesgue 
avoided the difficulty of Schwarz by a process involving the use of lower bounds of the 
sum of the areas of triangles approximating to the surface, thus preventing gross overestimates 
of the area, but Besicovitch has shown that it can lead to gross underestimates. 
He has invented a set of points with a finite area according to Lebesgue’s definition, but 
with a positive volume.* The best general definition seems to be by C. Carathéodory.† 
All these definitions lead to the same integral formula for the areas of ordinary surfaces. 
Necessary and sufficient conditions for the formula to be given by Carathéodory’s definition 
are given by Besicovitch.‡ 

In suffix notation, in transforming an integral over a surface we can interpret $l_i\,dS$ as 

$$l_i\,dS = \epsilon_{ikm}\frac{\partial x_k}{\partial\xi}\frac{\partial x_m}{\partial\eta}\,d\xi\,d\eta, \tag{8}$$

and 

$$\epsilon_{ips}\,l_i\,dS = \frac{\partial(x_p, x_s)}{\partial(\xi, \eta)}\,d\xi\,d\eta, \tag{9}$$

subject again to the restriction that $d\xi$, $d\eta$ and the normal form a right-handed frame. 
If they form a left-handed frame the signs are reversed. 

* Q. J. Math. (Oxford series), 16, 1945, 86-102. 
† Gött. Nachr. 1914, 404-26. 
‡ Q. J. Math. (Oxford series), 20, 1949, 1-7. 


5·08. Green’s lemma. This theorem, otherwise known as Gauss’s theorem, 
Ostrogradsky’s theorem, and the divergence theorem, asserts that if V is a closed region 
and S its bounding surface 

$$\iiint_V \left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right) dx\,dy\,dz = \iint_S (lu + mv + nw)\,dS, \tag{1}$$

provided that the triple integrals of ∂u/∂x, ∂v/∂y, ∂w/∂z through V exist and no straight 
line parallel to an axis meets S more than a fixed number of times; l, m, n are the direction 
cosines of the outward normal to S. It is to be understood as in 5·05 that the integral of 
∂u/∂x through V means the integral through a parallelepiped D enclosing V, bounded by 
planes $x = X_1, X_2$, $y = Y_1, Y_2$, $z = Z_1, Z_2$, of a function p(x, y, z) equal to ∂u/∂x in V and 
zero outside it, and the statement that the integral through V exists means that the 
function p(x, y, z) is integrable through D. 

By 5·051 the existence of the triple integral implies that the integral can be evaluated 
by successive integration. Then 

$$\iiint_V \frac{\partial u}{\partial x}\,dx\,dy\,dz = \int_{Y_1}^{Y_2}\!\!\int_{Z_1}^{Z_2} dy\,dz \int_{X_1}^{X_2} p(x, y, z)\,dx. \tag{2}$$

For any pair of values of y, z that are taken by no point of V, p is zero for all x. If a line 
parallel to the axis of x does meet S, label the intersections 1, 2, 3, ... in order of increasing 
x, so that odd suffixes will correspond to points of entry and even ones to points of exit. 
The number of intersections 2k is bounded with respect to y, z. Then along such a line 

$$\int_{X_1}^{X_2} p\,dx = \sum_{r=1}^{k}\int_{x_{2r-1}}^{x_{2r}} \frac{\partial u}{\partial x}\,dx = \sum_{r=1}^{2k} (-1)^r u_r, \tag{3}$$

and 

$$\iiint_V \frac{\partial u}{\partial x}\,dx\,dy\,dz = \iint \sum_{r} (-1)^r u_r\,dy\,dz = \sum_{r} (-1)^r \iint u_r\,dy\,dz, \tag{4}$$

the integrand being taken as 0 for values of y, z corresponding to no point of S. 
Now from 5·07(4) 

$$\frac{1}{l^2} = 1 + \left(\frac{\partial x}{\partial y}\right)^2 + \left(\frac{\partial x}{\partial z}\right)^2, \tag{5}$$

and for any portion of S 

$$\iint dS = \iint dy\,dz/|l|. \tag{6}$$

But we see by reference to a figure that at a point of exit the outward normal makes an 
acute angle with the x axis and therefore has an x direction cosine $l = |l|$; at a point of 
entry $l = -|l|$. Hence the integral on the right of (4) is $\iint lu\,dS$ taken over all points 
of intersection. But these points of entry and exit cover the whole of S except where a line 
parallel to the axis of x lies on S over an interval of x; and on such a line l = 0 and therefore 
any part of S composed of such lines makes no contribution to the integral. 

We obtain similar results for $\iiint \frac{\partial v}{\partial y}\,dx\,dy\,dz$ and $\iiint \frac{\partial w}{\partial z}\,dx\,dy\,dz$ by integrating first 
with respect to y and z respectively, and by addition we have the theorem (1). 

The expression $l\,dS$ is really to be regarded as an abbreviation for $\pm\,dy\,dz$, the sign being 
chosen according as points of a neighbourhood are points of exit or entry, so that the right 
side of (1) is to be regarded as the sum of three ordinary double integrals and does not 
involve the special difficulties of the area of a surface. In evaluation of the surface integral, 
when the surface is expressed parametrically, the formula 5·07(8) will be used: 

$$\iint l_i u_i\,dS = \iint \epsilon_{ikm}\,u_i\,\frac{\partial x_k}{\partial\xi}\frac{\partial x_m}{\partial\eta}\,d\xi\,d\eta, \tag{7}$$

where $d\xi$, $d\eta$ and the normal form a right-handed set of directions. 

The above proof gives greater generality than is often needed. Sufficient conditions 
usually satisfied are (1) S is bounded, has a continuously turning normal except possibly 
on a finite set of rectifiable curves, and is not intersected more than a fixed number of 
times by any straight line; (2) the derivatives ∂u/∂x, ∂v/∂y, ∂w/∂z exist in V and are 
bounded, and are continuous except possibly on a finite set of surfaces of finite area. 
There is no objection to V having a hole in it. It may, for instance, be the region bounded 
by two concentric spheres. Then S consists of both the outer and the inner boundary, so 
that the normal at the inner boundary is taken outwards from V and therefore toward 
the centre. 
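
The lemma is easily verified numerically in a simple case. The sketch below takes V to be the unit cube and an arbitrary polynomial field (u, v, w) = (x², yz, zx + y), both our own choices, and compares the volume integral of the divergence with the sum of the three double integrals over the faces, each face being counted with the sign of the outward normal.

```python
def midpoints(n):
    """Midpoints of n equal subintervals of (0, 1), with the interval length."""
    h = 1.0 / n
    return [(i + 0.5) * h for i in range(n)], h

def volume_integral(div, n=20):
    """Midpoint-rule integral of div(x, y, z) through the unit cube."""
    pts, h = midpoints(n)
    return sum(div(x, y, z) for x in pts for y in pts for z in pts) * h**3

def surface_flux(u, v, w, n=20):
    """Integral of lu + mv + nw over the six faces of the unit cube."""
    pts, h = midpoints(n)
    total = 0.0
    for a in pts:
        for b in pts:
            total += u(1.0, a, b) - u(0.0, a, b)  # faces x = 1 (l = 1) and x = 0 (l = -1)
            total += v(a, 1.0, b) - v(a, 0.0, b)  # faces y = 1 and y = 0
            total += w(a, b, 1.0) - w(a, b, 0.0)  # faces z = 1 and z = 0
    return total * h * h

# Illustrative field and its divergence.
u = lambda x, y, z: x * x
v = lambda x, y, z: y * z
w = lambda x, y, z: z * x + y
div = lambda x, y, z: 2 * x + z + x
```

Both integrals are equal to 2 for this field; since every integrand that occurs is linear in the variables of integration, the midpoint rule gives them without quadrature error.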

If u, v, w are the components of a vector $u_i$ we can write the theorem in the forms 

$$\iiint \frac{\partial u_i}{\partial x_i}\,d\tau = \iint l_i u_i\,dS = \iint u_n\,dS, \tag{8}$$

$u_n$ being the component of u along the outward normal. The method of proof given treats 
the terms separately and there would be no special advantage in using vector or tensor 
notation in it. The proof that we shall give later (11·053) uses a suffix notation, and is 
somewhat more general, but also is more difficult. 

A common practice, especially in German books, is to write the integrals with only one 
sign of integration. This has the disadvantage that for evaluation the integrals must be 
written as double or triple integrals, and some confusion can arise: there are no variables 
of integration S and $\tau$. 

If all the components are independent of z and we apply the theorem to the region 
between two planes of constant z and a cylinder with its generators parallel to the z axis, 
we get the two-dimensional form of the theorem 



$$\iint \left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) dx\,dy = \int (lu + mv)\,ds. \tag{9}$$

For the ends contribute nothing to the surface integral, since l = m = 0 there, and the 
values of nw at corresponding points of the two ends are equal and opposite; and the 
length of the cylinder cancels. This result is also easily proved directly. 

Replacing u by v and v by −u we have 

$$\iint \left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) dx\,dy = \int (lv - mu)\,ds. \tag{10}$$

But if we proceed along the tangent in the positive direction (i.e. keeping the area on the 
left), the direction cosines of the tangent are (−m, l) = (dx/ds, dy/ds). Hence the integral 
on the right is $\int (u\,dx + v\,dy)$ taken around the boundary in the positive sense. This is the 
two-dimensional form of Stokes's theorem. 
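
A numerical illustration of the two-dimensional form may be useful. In the sketch below the region is the unit disc and the functions u = −y³, v = x³ are chosen by us for convenience; the double integral of ∂v/∂x − ∂u/∂y is compared with the integral of u dx + v dy round the circle described in the positive sense.

```python
import math

def area_integral(curl, n=400):
    """Midpoint rule in polar coordinates for a double integral over the unit disc."""
    hr, ht = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            total += curl(r * math.cos(t), r * math.sin(t)) * r * hr * ht
    return total

def boundary_integral(u, v, n=400):
    """Integral of u dx + v dy round the unit circle, keeping the area on the left."""
    ht = 2 * math.pi / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * ht
        x, y = math.cos(t), math.sin(t)
        # On the circle dx = -sin t dt and dy = cos t dt.
        total += (u(x, y) * (-math.sin(t)) + v(x, y) * math.cos(t)) * ht
    return total

# Illustrative functions and the corresponding dv/dx - du/dy.
u = lambda x, y: -y**3
v = lambda x, y: x**3
curl = lambda x, y: 3 * x * x + 3 * y * y
```

For these functions both integrals have the value 3π/2.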


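5·081. Green’s theorem. In Green’s lemma put 

$$u_i = U\frac{\partial V}{\partial x_i}.$$

Then 

$$\iiint \frac{\partial}{\partial x_i}\left(U\frac{\partial V}{\partial x_i}\right) d\tau = \iint l_i\,U\frac{\partial V}{\partial x_i}\,dS,$$

that is 

$$\iiint \left(U\nabla^2 V + \frac{\partial U}{\partial x_i}\frac{\partial V}{\partial x_i}\right) d\tau = \iint U\frac{\partial V}{\partial n}\,dS,$$

where by ∂V/∂n we understand differentiation along the outward normal. Similarly 

$$\iiint \left(V\nabla^2 U + \frac{\partial U}{\partial x_i}\frac{\partial V}{\partial x_i}\right) d\tau = \iint V\frac{\partial U}{\partial n}\,dS,$$

and therefore 

$$\iiint (U\nabla^2 V - V\nabla^2 U)\,d\tau = \iint \left(U\frac{\partial V}{\partial n} - V\frac{\partial U}{\partial n}\right) dS,$$

provided that $U\,\partial V/\partial x_i$ and $V\,\partial U/\partial x_i$ have first derivatives integrable within the region; 
that is, that the second derivatives of U and V exist and are integrable. 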


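5·082. Stokes’s theorem. Let C be a simple rectifiable closed curve and S a two-sided 
surface with C as its boundary. Let $x_i$ be the coordinates, $u_i$ three functions of 
position with continuous first derivatives. Stokes’s theorem is that 

$$\int_C u_i\,dx_i = \iint_S \epsilon_{ikm}\,l_i\,\frac{\partial u_m}{\partial x_k}\,dS, \tag{1}$$

or in vector notation 

$$\int_C \mathbf{u}\cdot d\mathbf{x} = \iint_S \mathbf{l}\cdot\operatorname{curl}\mathbf{u}\,dS, \tag{2}$$

where $l_i$ are the direction cosines of the normal to the surface at the element dS. The sense 
to be taken for the normal is given by the consideration that if C is continuously deformed 
and displaced so that it is described in the positive sense in a plane of z constant, and S lies 
in this plane, the normal becomes parallel to the positive direction of the z axis. 

We suppose the surface expressed in terms of parameters $\xi$, $\eta$. Then 

$$\iint_S \epsilon_{ikm}\,l_i\,\frac{\partial u_m}{\partial x_k}\,dS = \iint \frac{\partial u_m}{\partial x_k}\frac{\partial(x_k, x_m)}{\partial(\xi, \eta)}\,d\xi\,d\eta = \iint \left\{\frac{\partial}{\partial\xi}\left(u_m\frac{\partial x_m}{\partial\eta}\right) - \frac{\partial}{\partial\eta}\left(u_m\frac{\partial x_m}{\partial\xi}\right)\right\} d\xi\,d\eta = \int \left(u_m\frac{\partial x_m}{\partial\xi}\,d\xi + u_m\frac{\partial x_m}{\partial\eta}\,d\eta\right) \tag{3}$$

by the two-dimensional form of Stokes’s theorem. But this is just $\int_C u_m\,dx_m$. 

The proof assumes that $\partial^2 x_m/\partial\xi\,\partial\eta$ and $\partial^2 x_m/\partial\eta\,\partial\xi$ exist and are equal; this can be 
proved on the supposition that they are continuous.* 

$\mathbf{n}\,dS$ is often written as $d\mathbf{S}$, regarded as a vector with magnitude dS directed along $\mathbf{n}$. 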

5·09. Flux and circulation. If $u_i$ is a vector, $n_i$ the direction of the normal to a 
surface, $\iint n_i u_i\,dS$ taken over a surface is called the flux of $u_i$ through the surface. In 
hydrodynamics, if $u_i$ is the velocity of a fluid at any point, the flux is the volume of fluid 
passing through the surface per unit time; hence the name. $\int u_i\,dx_i$ taken around a 
circuit is called the circulation around the circuit. Then Green’s lemma can be read: 
The flux of a vector through a closed surface is equal to the volume integral of its divergence 
through the interior. Stokes’s theorem can be read: The circulation of a vector 
around a circuit is equal to the flux of its curl through a cap filling the circuit. 

If the $u_i$ are differentiable at P and we take any sequence of surfaces $S_n$ of constant form 
and orientation surrounding P, the diameter and volume of $S_n$ being $a_n$, $V_n$, it follows from 
5·04(5) that 

$$\frac{1}{V_n}\iint n_i u_i\,dS_n = \left(\frac{\partial u_i}{\partial x_i}\right)_P + \frac{1}{V_n}\iint n_i v_i\,r\,dS_n, \tag{1}$$

where r is the distance from P and $|v_i| \to 0$ uniformly with $a_n$. 

The last term tends to 0 with $a_n$, since $\iint r\,dS_n = O(a_n^3) = O(V_n)$. Hence 

$$\left(\frac{\partial u_i}{\partial x_i}\right)_P = \lim_{n\to\infty}\frac{1}{V_n}\iint n_i u_i\,dS_n. \tag{2}$$

This result is particularly useful when the divergence of a vector has to be expressed in 
terms of curvilinear orthogonal coordinates. 

Vectors with zero divergence everywhere in a region are called solenoidal in the region. 
The flux of a solenoidal vector through any closed surface in the region is zero; and conversely, 
by the last result, if a vector has zero flux through any closed surface in a region, 
it is solenoidal in the region. Especially, in the flow of a fluid of density $\rho$, where $\rho$ may 
be variable, the rate of transfer of mass through a surface is $\iint \rho\,n_i u_i\,dS$. If the mass within 
every surface remains constant with time, the vector $\rho u_i$ is therefore solenoidal and 
div ($\rho\mathbf{u}$) = 0. In particular, if the fluid is homogeneous and incompressible, so that the 
mass within a closed surface within it is necessarily constant and $\partial\rho/\partial x_i = 0$, we have 
simply div $\mathbf{u}$ = 0. In many hydrodynamical problems the latter condition is satisfied. 
A sufficient condition for a vector to be solenoidal is evidently that it shall be the curl of 
another vector. We shall prove in 6·11 that this condition is necessary. 

Vectors with zero circulation about every circuit in a region are called irrotational in 
the region. If A and B($x_i$) are any two points in the region and we connect them by 
two different paths L and L′ in the region, 

$$\int_L u_i\,dx_i = \int_{L'} u_i\,dx_i. \tag{3}$$


* Cf. Gibson, Calcutta, 221-2. 
For the path along L from A to B together with the path L′ from B to A form a closed 
circuit in the region, and if $u_i$ is irrotational the integral around it will be zero. But in this 
circuit the part L′ is traversed in the opposite direction; hence the result stated follows. 

Thus $\int u_i\,dx_i$ depends only on the termini and not on the intervening path, and can be 
expressed in the form $\phi_B - \phi_A$, where $\phi$ is a scalar function of position. Now take a point 
B′ with coordinates $(x_1 + \delta x_1, x_2, x_3)$. To get from A to B′ we can go to B and then on to 
B′, and 

$$\phi_{B'} - \phi_B = \int_B^{B'} u_i\,dx_i = u_1\,\delta x_1 \tag{4}$$

to the first order in $\delta x_1$, whence 

$$u_1 = \frac{\partial\phi}{\partial x_1}, \tag{5}$$

and by symmetry 

$$u_i = \frac{\partial\phi}{\partial x_i}. \tag{6}$$

Hence an irrotational vector is the gradient of a scalar. Again, 

$$\frac{\partial u_i}{\partial x_k} = \frac{\partial^2\phi}{\partial x_i\,\partial x_k} = \frac{\partial u_k}{\partial x_i}, \tag{7}$$

and therefore an irrotational vector has zero curl. Conversely, if a vector has zero curl 
everywhere in a region we can apply Stokes’s theorem to show that its circulation about 
any circuit capable of being filled by a cap in the region is zero and therefore it is irrotational, 
and therefore, by the above argument, it is the gradient of a scalar. 
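
The independence of path can be illustrated numerically. In the following sketch the scalar φ = x²y + z, its gradient, and the two paths from A = (0, 0, 0) to B = (1, 2, 3) are all our own illustrative choices; the integral of $u_i\,dx_i$ is approximated by summing over short chords of each path.

```python
import math

def line_integral(u, path, n=2000):
    """Approximate the integral of u_i dx_i along path(t), 0 <= t <= 1.

    The path is replaced by n chords and u is evaluated at the midpoint
    of each chord.
    """
    total = 0.0
    for k in range(n):
        p0, p1 = path(k / n), path((k + 1) / n)
        mid = tuple(0.5 * (a + b) for a, b in zip(p0, p1))
        ui = u(*mid)
        total += sum(c * (b - a) for c, a, b in zip(ui, p0, p1))
    return total

# Illustrative scalar and its gradient.
phi = lambda x, y, z: x * x * y + z
grad_phi = lambda x, y, z: (2 * x * y, x * x, 1.0)

# Two different paths from A = (0, 0, 0) to B = (1, 2, 3).
straight = lambda t: (t, 2 * t, 3 * t)
curved = lambda t: (t * t, 2 * t**3, 3 * math.sin(math.pi * t / 2))
```

Both paths give the value $\phi_B - \phi_A = 5$ to within the error of the approximation.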

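If $u_i$ is both solenoidal and irrotational, $\phi$ exists and then, since $\partial u_i/\partial x_i = 0$, 

$$\frac{\partial^2\phi}{\partial x_1^2} + \frac{\partial^2\phi}{\partial x_2^2} + \frac{\partial^2\phi}{\partial x_3^2} = 0,$$

which is written compactly 

$$\nabla^2\phi = 0. \tag{8}$$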


EXAMPLES 


1. Prove that, if $-\pi < \alpha < \pi$, 

$$\int_0^\infty\!\!\int_0^\infty \exp(-x^2 - 2xy\cos\alpha - y^2)\,dx\,dy = \frac{\alpha}{2\sin\alpha}.$$

2. S, T are the fixed points (0, 0, ±½R), and P is the variable point (x, y, z); the distances PS, PT 
are denoted by s, t respectively. If 

$$\xi = \frac{s+t}{R}, \qquad \eta = \frac{s-t}{R},$$

and $\phi$ is the angle between the plane PST and the plane y = 0, show that 

$$\left|\frac{\partial(x, y, z)}{\partial(\xi, \eta, \phi)}\right| = \tfrac18 R^3(\xi^2 - \eta^2).$$
Hence prove that 


j J J — e ‘’ <t + i)lR dxdydz = 


the integral being taken over all space. 


(Prelim. 1942.) 
3. By using the transformation 

$$x + iy = (u + iv)^2,$$

or otherwise, evaluate 

$$\iint (x^2 + y^2)^{-1/2}\,dx\,dy$$

taken over the region enclosed by arcs of the confocal parabolas 

$$y^2 = 4a_r(x + a_r) \qquad (r = 1, 2, 3),$$

where $a_1 > a_2 > 0$, $a_3 < 0$. 

4. Prove that the integral 


J 


dS 

V 


taken over the surface of an ellipsoid of semi-axes a, b, c, where p is the length of the perpendicular 
from the centre on to the tangent plane at a point of the ellipsoid, is equal to 


Evaluate 


f dS 

J V 8 ‘ 


(M.T. 1940.) 

5. Determine the new limits in the following integrals, when the orders of integration are reversed: 
ri ri rVair rb cosec d 

I dx\ f(x,y)dy, dd\ f(r,0)dr. (I.C. 1940.) 

J o J J fi Jo 


6. Express the integral 


ra rt 

dz\ dy I 
Jo Jo Jo 


dx 


in spherical polar coordinates, and show that its value is $\pi a^3/24$. 


(I.C. 1938.) 


7. If K n is a homogeneous polynomial of positive integral degree n in the coordinates, satisfying 
V 2 K n = 0, and £ is a sphere of radius a about the origin, prove that 




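8. If A and B are two vector functions of position, prove that 

$$\operatorname{div}(\mathbf{A}\wedge\mathbf{B}) = \mathbf{B}\cdot\operatorname{curl}\mathbf{A} - \mathbf{A}\cdot\operatorname{curl}\mathbf{B}.$$

If further A and B are functions of the time and are connected by the relations 

$$\frac{\partial\mathbf{A}}{\partial t} = \operatorname{curl}\mathbf{B}, \qquad \frac{\partial\mathbf{B}}{\partial t} = -\operatorname{curl}\mathbf{A},$$

and $\tau$ is a volume enclosed by a fixed surface S, prove that 

$$\frac{\partial}{\partial t}\iiint \tfrac12(A^2 + B^2)\,d\tau = -\iint (\mathbf{A}\wedge\mathbf{B})\cdot d\mathbf{S}.$$

(M/c, Part III, 1931.) 

9. Prove that 

$$\operatorname{div}(\phi\mathbf{A}) = \phi\operatorname{div}\mathbf{A} + \mathbf{A}\cdot\operatorname{grad}\phi.$$

If div D = $4\pi\rho$, E = −grad $\phi$, D = KE, where K is independent of E, show that, if $\phi$ is O(1/r) at infinity 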




the integrals being taken through all space. 





Chapter 6 

POTENTIAL THEORY 


‘But all that moveth doth Mutation love.’ 

SPENSER, The Faerie Queene, Bk. 7 

6·01. 1/r as a solution of $\nabla^2\phi = 0$. Let $x_i$ be the coordinates of a point P and r its 
distance from the origin. Then 

$$\frac{\partial}{\partial x_i}\frac{1}{r} = \frac{d}{dr}\left(\frac{1}{r}\right)\frac{\partial r}{\partial x_i} = -\frac{x_i}{r^3}, \tag{1}$$

$$\frac{\partial^2}{\partial x_i\,\partial x_k}\frac{1}{r} = -\frac{\partial}{\partial x_k}\frac{x_i}{r^3} = -\frac{1}{r^3}\frac{\partial x_i}{\partial x_k} + \frac{3x_i}{r^4}\frac{\partial r}{\partial x_k} = \frac{3x_i x_k - r^2\delta_{ik}}{r^5}. \tag{2}$$

Now put k = i and apply the summation convention; since 

$$x_i x_i = r^2, \qquad \delta_{ii} = 3,$$

$$\frac{\partial^2}{\partial x_i\,\partial x_i}\frac{1}{r} = 0, \tag{3}$$

so that 1/r is a solution of Laplace’s equation, $\nabla^2\phi = 0$, except at the origin. 
It follows at once that if $\xi_i$ are the coordinates of another point Q, and 

$$r^2 = (x_i - \xi_i)^2$$

(summed), then 

$$\frac{\partial^2}{\partial x_i^2}\frac{1}{r} = 0, \tag{4}$$

except at $x_i = \xi_i$. Note that $\dfrac{\partial}{\partial x_i}\dfrac{1}{r} = -\dfrac{\partial}{\partial\xi_i}\dfrac{1}{r}$, a result that will be needed repeatedly. 

Further, if we take n points $Q_1, \ldots, Q_n$ and denote the 
various distances $Q_s P$ by $r_s$, then 

$$\nabla^2\sum_{s=1}^{n}\frac{a_s}{r_s} = 0, \tag{5}$$

where $a_s$ are any constants, except when any of the $r_s$ is 0, that is, when P coincides 
with any of the points $Q_s$. Differentiation is of course understood to be with regard to 
the coordinates of P. Hence with this restriction any function of the form $\sum a_s/r_s$ is 
a solution of Laplace’s equation. 
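
This property is easily tested numerically. The sketch below forms $\sum a_s/r_s$ for three points $Q_s$ with constants $a_s$ chosen arbitrarily by us, and estimates the Laplacian at a point P away from them by central differences; the function names and the step h are assumptions made for the illustration.

```python
import math

def potential(p, sources):
    """Sum of a_s / r_s, the sources being given as (a_s, Q_s) pairs."""
    return sum(a / math.dist(p, q) for a, q in sources)

def laplacian(f, p, h=1e-3):
    """Central-difference estimate of the Laplacian of f at the point p."""
    total = 0.0
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        total += f(plus) + f(minus) - 2 * f(p)
    return total / (h * h)

# Illustrative constants a_s and points Q_s.
sources = [(1.0, (0.0, 0.0, 0.0)), (-2.5, (1.0, 0.0, 0.0)), (0.7, (0.0, 2.0, 1.0))]
P = (2.0, 1.0, -1.5)

def inverse_square(p):
    """1/r^2 about the origin, included for contrast."""
    return 1.0 / sum(c * c for c in p)
```

The estimate for $\sum a_s/r_s$ is zero to within the error of the differences, whereas that for $1/r^2$ is close to $2/r^4$, so that the latter function is not a solution.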

Now the gravitational potential due to a distribution of particles is of this form. So is 
the electrostatic potential due to a set of point charges. Hence both satisfy Laplace’s 
equation. This equation arises also in the hydrodynamics of an incompressible fluid. For 
if $u_i$ is the velocity at P($x_i$) the condition that the mass within any closed surface is 
constant requires that $u_i$ is a solenoidal vector; and for any circuit capable of being filled 
by a cap occupied wholly by fluid that has never passed near a solid boundary the circulation 
$\int u_i\,dx_i$ round it is practically zero, so that to a good approximation $u_i$ is also 
an irrotational vector. In classical hydrodynamics we adopt the approximation 

$$u_i = \frac{\partial\phi}{\partial x_i},$$

where $\phi$ is a scalar function of the coordinates, satisfying Laplace’s equation, and called 
the velocity potential. The solutions of this equation therefore contain the whole of the 
part of hydrodynamics that neglects viscosity and treats the fluid as having a constant 
density independent of pressure and of any other complication such as variation of 
temperature or composition. These conditions are not satisfied in any real fluid, but in 
many actual motions of fluids about solid boundaries they are satisfied within the observational 
uncertainty except in parts of the fluid that have passed close to a solid; and the 
modern development called boundary layer theory deals with the latter regions, which are 
usually thin. 

If r is the distance of P from the origin, $r_s/r \to 1$ when $r \to \infty$. Hence when r is large 
enough 

$$\sum_{s=1}^{n}\frac{a_s}{r_s} \approx \frac{1}{r}\sum_{s=1}^{n} a_s. \tag{6}$$

Thus unless $\sum a_s = 0$, $\phi$ behaves for large r like 1/r. If $\sum a_s = 0$, $\phi$ will decrease more 
rapidly than 1/r. 

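Now consider the flux of the gradient of 1/r through a sphere of radius a about the origin. 
The direction cosines of the outward normal are $x_i/r$; hence 

$$\frac{\partial}{\partial n}\frac{1}{r} = \frac{x_i}{r}\frac{\partial}{\partial x_i}\frac{1}{r} = -\frac{1}{r^2} = -\frac{1}{a^2}.$$

But the area of the sphere is $4\pi a^2$; hence 

$$\iint \frac{\partial}{\partial n}\left(\frac{1}{r}\right) dS = -4\pi. \tag{7}$$

This can be extended to any closed surface surrounding the origin. For if we take such 
a surface S, and take a sphere $\Sigma$ large enough to enclose S completely, we can apply 
Green’s lemma to the region $\Sigma - S$ between S and $\Sigma$. Using ∂/∂ν to denote differentiation 
along the outward normal from this region, which will be outwards from O over $\Sigma$ and 
inwards over S, we have 

$$\iiint \nabla^2\left(\frac{1}{r}\right) d\tau = \iint_\Sigma \frac{\partial}{\partial\nu}\left(\frac{1}{r}\right) dS + \iint_S \frac{\partial}{\partial\nu}\left(\frac{1}{r}\right) dS. \tag{8}$$

The left side is zero. The contribution to the right from $\Sigma$ is $-4\pi$. Hence 

$$\iint_S \frac{\partial}{\partial\nu}\left(\frac{1}{r}\right) dS = 4\pi. \tag{9}$$

Since dν is here taken towards the side containing O, it follows that if we take dn outwards 
from O, 

$$\iint_S \frac{\partial}{\partial n}\left(\frac{1}{r}\right) dS = -4\pi, \tag{10}$$

which extends (7) to any surface enclosing the origin. (An easy application of Green’s 
lemma shows that this integral is zero if S does not enclose the origin.) 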
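
Both statements can be illustrated with a surface that is not a sphere. The sketch below evaluates the flux of the gradient of 1/r, namely $-x_i/r^3$, outwards through the faces of a cube by the midpoint rule; the cube, its size and the two positions of its centre are our own choices.

```python
import math

def flux_through_cube(centre, half=1.0, n=100):
    """Midpoint-rule flux of grad(1/r) = -x_i / r^3 out of an axis-aligned cube.

    centre is the centre of the cube and half is half the length of an edge;
    r is measured from the origin.
    """
    h = 2 * half / n
    offsets = [-half + (k + 0.5) * h for k in range(n)]
    total = 0.0
    for axis in range(3):
        j, k = (axis + 1) % 3, (axis + 2) % 3
        for sign in (-1.0, 1.0):
            for a in offsets:
                for b in offsets:
                    p = [0.0, 0.0, 0.0]
                    p[axis] = centre[axis] + sign * half
                    p[j] = centre[j] + a
                    p[k] = centre[k] + b
                    r = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
                    # The outward normal is sign times the unit vector along axis.
                    total += sign * (-p[axis] / r**3) * h * h
    return total
```

With the cube centred on the origin the flux is close to −4π; with its centre at (3, 0, 0), so that the origin is outside, the flux is close to zero.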
In classical hydrodynamics, if the flow is radial and symmetrical about the origin, and 
u is the velocity at distance r from the origin, u is a function of r only. The rate of outflow 
through a sphere of radius r is then $4\pi r^2 u$. If the region between two such spheres of 
radii $r_1$ and $r_2$ is filled with fluid throughout the motion, it follows that $r_1^2 u_1 = r_2^2 u_2$; thus 

$$u = \frac{m}{r^2}, \tag{11}$$

where $4\pi m$ is the volume of fluid issuing from the origin per unit time. Then 

$$\phi = -\frac{m}{r}, \tag{12}$$

apart from an irrelevant constant. If m is positive, this is the velocity potential of a source 
emitting volume $4\pi m$ of fluid per unit time; if m is negative, it is that of a sink. That due 
to any set of sources and sinks is obtainable by addition as for gravitation and electrostatics. 

The resemblance between these three branches of mathematical physics, to which the 
flow of electric current in a uniform conductor may be added, is so close that the mathematical 
theory common to all is most conveniently developed in one piece. With a slight 
modification much of the theory is also applicable to magnetism. It is known as potential 
theory. 

It follows at once from (10) that if 

$$\phi = \sum\frac{a_s}{r_s}, \tag{13}$$

$$\iint \frac{\partial\phi}{\partial n}\,dS = -4\pi\,{\sum}' a_s, \tag{14}$$

where the summation in $\sum'$ covers all the $a_s$ corresponding to the points $Q_s$ that lie within 
S; for if $Q_s$ lies within S, $a_s/r_s$ contributes $-4\pi a_s$ to the sum, and if $Q_s$ lies outside S the 
term contributes 0. This is a case of Gauss's theorem. 

6·02. Continuous distributions. The application of these results to continuous distributions 
meets with two difficulties, one mathematical and one physical. If we identify 
our particles with protons and electrons, the gravitational and electrical potentials will 
consist of finite sums of the form just discussed. In practice, however, the number of 
elements in any piece of matter of ordinary size is so large that the working out of the sum 
would be impossible, and also there are additional forces at short distances. But just for 
this reason another method becomes possible. An integral is the limit of a finite sum when 
the number of intervals becomes very large, their total length remaining the same. Thus 
a sum over a large finite number of intervals is a good approximation to the integral; 
but, conversely, the integral is a good approximation to the sum. This suggests that 
instead of taking the matter or charge as concentrated in separate points we may take it 
as distributed continuously through the volume, in such a way as to keep approximately 
the same total mass or charge in any given region containing many elements. It is 
obviously impossible to make the totals exactly the same for every region. For if so we 
could take a closed surface surrounding a single element; the mass within it will tend to 
zero when the surface is taken small enough if the distribution is continuous, but to a 
finite limit for the actual distribution. The approximation is in fact good only for expressing 
average properties of regions containing many elements, and will not deal with 
individual particles. The former type of properties are usually called molar or macroscopic, 
the latter molecular or microscopic. (It need not be inferred from this choice of words that 
an actual microscope is capable of observing individual molecules!) If, for instance, 
a region not exceptionally long in one dimension in comparison with the others contains 
n elements, where n is large, in the actual distribution, and the equivalent mass or 
charge of n ± in the continuous one, we have the sort of approximation required. 
The density at each point can be identified with the ratio of the total in such a 
region to its volume, and similarly for the charge density, and both will be finite everywhere.

The physical difficulty is that if a solid consisted entirely of stationary particles, under 
no forces except inverse square ones, it could not be stable and would collapse to zero 
volume. This has been partly met either by supposing additional repulsive forces,
considerable at short distances, but falling off with distance faster than $r^{-2}$, or by supposing
the particles to be in rapid orbital motion. In the former case the additional forces must
be studied separately; this method has been adopted especially by Born and Lennard-Jones
in the theories of crystals and gases. In the latter the potential due to a given body,
even apparently at rest, will be a rapidly varying function of the time. The quantum theory
aims at combining both suggestions into a single hypothesis. The various solutions all
make the forces between particles follow the inverse square law so long as the distance is
considerably greater than $10^{-8}$ cm., and their mean values over intervals of the order of
$10^{-17}$ sec. change little from one such interval to the next. Consequently the mathematical
and physical difficulties can be met in the same way: the predictions of the inverse square
law will be right provided that we apply them only to changes of mean position or
momentum, during intervals of time longer than about $10^{-17}$ sec., of the matter within
regions greater than about $10^{-8}$ cm. in linear extent, and when they are right they will
be approximately the same as those for continuous distributions that preserve the same 
total mass or charge within such regions. 

In our formula for $\phi$ we therefore replace $a_s$ by $\rho\,d\xi_1\,d\xi_2\,d\xi_3$, so that $\xi_i$ specifies a point
of the distribution. We continue to use $x_i$ for the point where $\phi$ is being evaluated. Then
we are led to study the function

$$\phi = \gamma\iiint\frac{\rho}{R}\,d\xi_1\,d\xi_2\,d\xi_3, \tag{1}$$

where now $\rho$ will be a function of $\xi_1, \xi_2, \xi_3$, the coordinates of a point $Q$; $x_1, x_2, x_3$ are
coordinates of a point $P$, and $R$ is the distance $QP$ given by

$$R^2 = (x_i - \xi_i)^2. \tag{2}$$

$\gamma$ is a constant with different values in different branches of physics. For shortness we
write $d\xi_1\,d\xi_2\,d\xi_3 = d\tau$, so that $d\tau$ is an element of volume.

We suppose as before that $\rho = 0$ for points at more than a given distance from the
origin. Then again when $r$ is large

$$r\phi \to \gamma\iiint\rho\,d\tau. \tag{3}$$

If $\rho = 0$ throughout any region, $\nabla^2\phi = 0$ in that region. For if we differentiate (1) under
the integral sign we get

$$\frac{\partial^2\phi}{\partial x_i\,\partial x_k} = \gamma\iiint\rho\,\frac{\partial^2}{\partial x_i\,\partial x_k}\Bigl(\frac{1}{R}\Bigr)d\tau.$$

For values of $R$ sufficiently small, $Q$ will be in the region where $\rho = 0$. The integrand is
therefore a continuous function of $x_1, x_2, x_3$, wherever $R$ does not vanish, and the ranges
of integration are finite. Differentiation under the integral sign is therefore justified.
Then, by addition, we get as before $\nabla^2\phi = 0$.

Important intermediate cases between discrete particles and continuous distributions
are surface and line distributions, in which the mass or charge per unit area or unit length
respectively is finite. It is again obvious that the potential satisfies Laplace’s equation
except possibly on the surface or line.

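6·03. Uniform spherical shell. Before proceeding further with the general theory
we consider a few important special cases. Take first a uniform surface density $\sigma$ over a
sphere of radius $a$. $\phi$ is obviously independent of direction and therefore is a function of
$r$ only, and it must satisfy Laplace’s equation except actually on the sphere. Now if
accents denote differentiation with respect to $r$, and $\phi$ is a function of $r$ only

$$\frac{\partial\phi}{\partial x_i} = \frac{x_i}{r}\,\phi', \qquad \frac{\partial^2\phi}{\partial x_i\,\partial x_k} = \frac{x_i x_k}{r^2}\,\phi'' + \Bigl(\frac{\delta_{ik}}{r} - \frac{x_i x_k}{r^3}\Bigr)\phi', \tag{1}$$

$$\nabla^2\phi = \phi'' + \frac{3}{r}\phi' - \frac{1}{r}\phi' = \phi'' + \frac{2}{r}\phi' = \frac{1}{r}\frac{d^2}{dr^2}(r\phi). \tag{2}$$

Hence since $\nabla^2\phi = 0$ everywhere except on the sphere

$$\phi = A + \frac{B}{r}, \tag{3}$$

where $A$ and $B$ are constants. But when $r$ is large

$$r\phi \to \gamma\iint\sigma\,dS = 4\pi\gamma a^2\sigma. \tag{4}$$

Therefore outside the sphere $A = 0$ and $B = 4\pi\gamma a^2\sigma$; and*

$$\phi = \frac{4\pi\gamma a^2\sigma}{r} \quad (r > a). \tag{5}$$

At the centre $\phi$ is obviously finite and equal to $4\pi\gamma a\sigma$; hence $B$ must be 0 within the
sphere and

$$\phi = 4\pi\gamma a\sigma \quad (r < a). \tag{6}$$

Thus $\phi$ has two distinct analytic forms inside and outside the sphere. Its radial
derivative is

$$\frac{\partial\phi}{\partial r} = -\frac{4\pi\gamma a^2\sigma}{r^2} \quad (r > a), \qquad \frac{\partial\phi}{\partial r} = 0 \quad (r < a), \tag{7}$$
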
* No stronger evidence for the need for an adequate notation could be required than that
provided by the history of this formula. There is good reason to suppose that Newton delayed for about
twenty years his publication of the comparison of the gravitational acceleration of the moon
with gravity at the earth’s surface because he was unable to prove that the attraction of a sphere
at all external points was the same as that of a particle of equal mass at its centre. But the proof
from Laplace’s equation is so easy as to be almost trivial. Why did Newton not discover Laplace’s
equation? Presumably because his fluxion notation did not allow him to contemplate more than
one independent variable and could not express partial derivatives.
and therefore has a discontinuity $-4\pi\gamma\sigma$ as $r$ increases through $a$. $\phi$ itself, however, is
continuous on crossing the sphere. This is a general result: the potential is continuous
when a surface distribution is crossed, but the normal derivative has a discontinuity of
$-4\pi\gamma\sigma$.
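
The two analytic forms (5) and (6) can be checked by direct summation over the shell. The following is a minimal sketch, not part of the original text: the function names, the default values $a = \sigma = \gamma = 1$ and the midpoint rule in the polar angle are all assumptions made for illustration.

```python
import math

def shell_potential(r, a=1.0, sigma=1.0, gamma=1.0, n=20000):
    """Potential of a uniform spherical shell at distance r from its centre,
    by midpoint-rule integration over the polar angle theta (illustrative)."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        # distance from the field point to a ring of the shell at angle theta
        R = math.sqrt(a * a + r * r - 2.0 * a * r * math.cos(theta))
        # the ring has area 2*pi*a^2*sin(theta)*dtheta
        total += 2.0 * math.pi * a * a * math.sin(theta) / R * h
    return gamma * sigma * total

def shell_potential_exact(r, a=1.0, sigma=1.0, gamma=1.0):
    """Closed forms (5) for r > a and (6) for r < a."""
    if r > a:
        return 4.0 * math.pi * gamma * a * a * sigma / r
    return 4.0 * math.pi * gamma * a * sigma
```

Evaluating both functions at one external and one internal radius should give agreement to many figures, provided the field point is not taken on the shell itself, where the integrand is singular.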


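6·031. Uniform sphere. Next consider a uniform sphere of density $\rho$ and radius $a$.
We can imagine it built up of concentric spherical shells of radius $\alpha$, thickness $\delta\alpha$, and
therefore surface densities $\rho\,\delta\alpha$. Then at points outside the sphere

$$\phi = \frac{4\pi\gamma\rho}{r}\int_0^a \alpha^2\,d\alpha = \frac{4\pi\gamma\rho a^3}{3r} \quad (r > a), \tag{8}$$

as we should expect. For internal points it is easiest to work with $\partial\phi/\partial r$, since shells with
$\alpha > r$ make no contribution to this. We have

$$\frac{\partial\phi}{\partial r} = -\gamma\int_0^r \frac{4\pi\rho\alpha^2}{r^2}\,d\alpha = -\tfrac{4}{3}\pi\gamma\rho r \quad (r < a). \tag{9}$$

At the centre

$$\phi = 4\pi\gamma\rho\int_0^a \alpha\,d\alpha = 2\pi\gamma\rho a^2, \tag{10}$$

and therefore

$$\phi = 2\pi\gamma\rho\,(a^2 - \tfrac{1}{3}r^2) \quad (r < a). \tag{11}$$

Hence both $\phi$ and $\partial\phi/\partial r$ are continuous on crossing the sphere, for

$$\lim_{r\to a}\frac{4\pi\gamma\rho a^3}{3r} = \lim_{r\to a} 2\pi\gamma\rho\,(a^2 - \tfrac{1}{3}r^2) = \tfrac{4}{3}\pi\gamma\rho a^2, \tag{12}$$

$$\lim_{r\to a}\Bigl(-\frac{4\pi\gamma\rho a^3}{3r^2}\Bigr) = \lim_{r\to a}\bigl(-\tfrac{4}{3}\pi\gamma\rho r\bigr) = -\tfrac{4}{3}\pi\gamma\rho a. \tag{13}$$

But $\phi$ for $r < a$ does not satisfy Laplace’s equation; for

$$\frac{\partial}{\partial x_i}\bigl(-\tfrac{2}{3}\pi\gamma\rho r^2\bigr) = -\tfrac{4}{3}\pi\gamma\rho x_i, \qquad \nabla^2\bigl(-\tfrac{2}{3}\pi\gamma\rho r^2\bigr) = -4\pi\gamma\rho. \tag{14}$$

This is Poisson’s equation and expresses a general property of the potential inside matter.
It follows that when the density is discontinuous we can expect the potential and its
first derivatives to be continuous but at least one second derivative to be discontinuous.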
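
As a numerical illustration of (8), (11) and (14), the radial form of the Laplacian from 6·03 (2), $\nabla^2\phi = r^{-1}\,d^2(r\phi)/dr^2$, can be applied to the two analytic forms by a central second difference. This is only a sketch under stated assumptions: the function names, the unit values of $a$, $\rho$, $\gamma$ and the step size are illustrative choices, not taken from the text.

```python
import math

def sphere_potential(r, a=1.0, rho=1.0, gamma=1.0):
    """Potential of a uniform sphere: (11) inside, (8) outside."""
    if r < a:
        return 2.0 * math.pi * gamma * rho * (a * a - r * r / 3.0)
    return 4.0 * math.pi * gamma * rho * a ** 3 / (3.0 * r)

def radial_laplacian(f, r, h=1e-3):
    """(1/r) d^2(r f)/dr^2 for a function of r only, by a central second difference."""
    def g(s):
        return s * f(s)
    return (g(r + h) - 2.0 * g(r) + g(r - h)) / (h * h * r)
```

With these defaults `radial_laplacian(sphere_potential, 0.5)` should be close to $-4\pi$ and `radial_laplacian(sphere_potential, 2.0)` close to zero. The difference stencil must not straddle $r = a$, where the second derivative is discontinuous.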

6·032. Spherical cap. Consider now the potential on the axis of a segment of a
sphere of radius $a$ and semi-angle $\alpha$. We have

$$\phi = \gamma\sigma\int_0^{\alpha}\!\!\int_0^{2\pi}\frac{a^2\sin\theta\,d\theta\,d\lambda}{R}, \tag{15}$$

$\theta$ and $\lambda$ being spherical polar coordinates of $Q$. But

$$R^2 = a^2 + r^2 - 2ar\cos\theta, \qquad R\,dR = ar\sin\theta\,d\theta,$$

for variations of $Q$; hence

$$\phi = \frac{2\pi\gamma\sigma a}{r}\Bigl\{\sqrt{a^2 + r^2 - 2ar\cos\alpha} - |r - a|\Bigr\},$$

and again $\partial\phi/\partial r$ has a discontinuity $-4\pi\gamma\sigma$ when $r$ increases through $a$. $R$ for $\theta = 0$ must
of course be taken positive, $|r - a|$, whether $r > a$ or $r < a$, being simply the distance $AP$.
If $\alpha = \pi$, so that the segment becomes the whole surface of a sphere, we recover the
formulae (5).

If $a$ becomes large, with $r - a = x$ and $a\sin\alpha = b$ fixed, we get the case of a circular
disk of radius $b$. In this case

$$\phi = 2\pi\gamma\sigma\bigl\{\sqrt{b^2 + x^2} - |x|\bigr\}, \tag{19}$$

$$\frac{\partial\phi}{\partial x} = 2\pi\gamma\sigma\Bigl\{\frac{x}{\sqrt{b^2 + x^2}} - \frac{x}{|x|}\Bigr\}, \tag{20}$$

which tends to $+2\pi\gamma\sigma$ or $-2\pi\gamma\sigma$ as $x \to 0$ from negative or positive values respectively.
Formulae (18)-(20) are correct, of course, only when $P$ is on the line of symmetry. But
they show how the discontinuity in the normal derivative maintains the same value
even though the values on one side range from zero to $2\pi\gamma\sigma$ as $\alpha$ varies.
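
The disk formulae (19) and (20) lend themselves to a quick numerical check: the potential on the axis can be summed over rings, and the jump in the normal derivative read off from (20). The sketch below is illustrative only; the function names, the unit values of $b$, $\sigma$, $\gamma$ and the ring-summation scheme are assumptions.

```python
import math

def disk_axis_potential(x, b=1.0, sigma=1.0, gamma=1.0, n=20000):
    """Potential at height x on the axis of a uniform disk of radius b,
    summed over concentric rings of radius s (midpoint rule)."""
    h = b / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        # a ring of area 2*pi*s*ds lies at distance sqrt(s^2 + x^2) from P
        total += 2.0 * math.pi * s / math.sqrt(s * s + x * x) * h
    return gamma * sigma * total

def disk_axis_exact(x, b=1.0, sigma=1.0, gamma=1.0):
    """Formula (19)."""
    return 2.0 * math.pi * gamma * sigma * (math.sqrt(b * b + x * x) - abs(x))

def disk_axis_gradient(x, b=1.0, sigma=1.0, gamma=1.0):
    """Formula (20); x must be non-zero."""
    return 2.0 * math.pi * gamma * sigma * (x / math.sqrt(b * b + x * x) - math.copysign(1.0, x))
```

Taking the difference `disk_axis_gradient(eps) - disk_axis_gradient(-eps)` for a small positive `eps` should give a value close to $-4\pi\gamma\sigma$, the same discontinuity as for the complete sphere.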


6·033. Line density. As an example of a line density consider a distribution $\lambda$ per
unit length along the axis of $z$, from $\zeta = -a$ to $+a$. We have

$$\phi = \gamma\lambda\int_{-a}^{a}\frac{d\zeta}{\{x^2 + y^2 + (z - \zeta)^2\}^{1/2}}. \tag{21}$$

If we put

$$(x^2 + y^2)^{1/2} = \varpi, \quad z - \zeta = \varpi\tan\theta, \quad R = \{x^2 + y^2 + (z - \zeta)^2\}^{1/2} = \varpi\sec\theta,$$
$$R_1 = \{x^2 + y^2 + (z - a)^2\}^{1/2} = \varpi\sec\theta_1, \quad R_2 = \{x^2 + y^2 + (z + a)^2\}^{1/2} = \varpi\sec\theta_2,$$

this becomes

$$\phi = \gamma\lambda\int_{\theta_1}^{\theta_2}\sec\theta\,d\theta = \gamma\lambda\log\frac{\sec\theta_2 + \tan\theta_2}{\sec\theta_1 + \tan\theta_1} = \gamma\lambda\log\frac{R_2 + z + a}{R_1 + z - a} = \gamma\lambda\log\frac{R_1 + R_2 + 2a}{R_1 + R_2 - 2a}, \tag{22}$$

since

$$4az = R_2^2 - R_1^2.$$

If $R_1 + R_2 \to 2a$, this tends to $+\infty$ logarithmically; but if $R_1$ and $R_2$ are both large the
limit is finite. If $a$ is very large and $x, y, z$ not large

$$\phi \doteq -2\gamma\lambda\log\frac{\varpi}{2a}. \tag{23}$$

This is a solution of Laplace’s equation; for

$$\frac{\partial}{\partial x}\log(x^2 + y^2) = \frac{2x}{x^2 + y^2},$$
$$\frac{\partial^2}{\partial x^2}\log(x^2 + y^2) = \frac{2(x^2 + y^2) - 4x^2}{(x^2 + y^2)^2} = \frac{2(y^2 - x^2)}{(x^2 + y^2)^2}, \qquad \frac{\partial^2}{\partial y^2}\log(x^2 + y^2) = \frac{2(x^2 - y^2)}{(x^2 + y^2)^2},$$
$$\Bigl(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\Bigr)\log(x^2 + y^2) = 0. \tag{24}$$

Thus $\log(\varpi/2a)$ is a two-dimensional solution of Laplace’s equation. Its gradient is
independent of $a$, which is needed only because $\log\varpi$ is, strictly, meaningless because $\varpi$ is a
length, not a number. This is the simplest possible case of another general result: near a line
density $\lambda$ the potential tends to infinity like $-2\gamma\lambda\log(\varpi/b)$, where $\varpi$ is the shortest distance
to the line and $b$ is some fixed length, even though the line may be curved.
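
The closed form (22) and the long-line limit (23) can be compared with a direct summation of (21). The following sketch is not from the original; the function names, the unit defaults and the choice of sample points are assumptions for illustration.

```python
import math

def rod_potential_numeric(x, y, z, a=1.0, lam=1.0, gamma=1.0, n=20000):
    """Formula (21) by the midpoint rule over zeta from -a to +a."""
    h = 2.0 * a / n
    total = 0.0
    for k in range(n):
        zeta = -a + (k + 0.5) * h
        total += h / math.sqrt(x * x + y * y + (z - zeta) ** 2)
    return gamma * lam * total

def rod_potential_closed(x, y, z, a=1.0, lam=1.0, gamma=1.0):
    """Last form of (22), in terms of the distances R1, R2 to the two ends."""
    R1 = math.sqrt(x * x + y * y + (z - a) ** 2)
    R2 = math.sqrt(x * x + y * y + (z + a) ** 2)
    return gamma * lam * math.log((R1 + R2 + 2.0 * a) / (R1 + R2 - 2.0 * a))

def long_rod_limit(x, y, a, lam=1.0, gamma=1.0):
    """Approximation (23), valid when a is large compared with x, y, z."""
    w = math.sqrt(x * x + y * y)  # distance from the line, written as varpi in the text
    return -2.0 * gamma * lam * math.log(w / (2.0 * a))
```

For a field point off the axis the two evaluations of the finite rod should agree closely, and for `a` several thousand times larger than the coordinates the closed form should be nearly indistinguishable from `long_rod_limit`.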


6·034. A closely related two-dimensional potential function, as we shall see in the
theory of the complex variable, is

$$\phi = \tan^{-1}\frac{y}{x}. \tag{25}$$

For here

$$\frac{\partial\phi}{\partial x} = -\frac{y}{x^2}\,\frac{1}{1 + y^2/x^2} = -\frac{y}{x^2 + y^2}, \qquad \frac{\partial^2\phi}{\partial x^2} = \frac{2xy}{(x^2 + y^2)^2},$$

$$\frac{\partial\phi}{\partial y} = \frac{1}{x}\,\frac{1}{1 + y^2/x^2} = \frac{x}{x^2 + y^2}, \qquad \frac{\partial^2\phi}{\partial y^2} = -\frac{2xy}{(x^2 + y^2)^2},$$

and

$$\nabla^2\phi = 0. \tag{26}$$

The magnitude of grad $\phi$ is $1/\varpi$, as for $\phi = \log\varpi$, but its direction is along the circle
$\varpi$ = constant whereas that due to $\phi = \log(\varpi/b)$ is radial. $\phi$ is not a single-valued function
of $x$ and $y$, but its derivatives are single-valued.

Volume densities are good approximations to those that arise in problems of gravitation,
and surface densities to those of electrostatics. Line densities are more difficult to
realize in the theory of attractions, but the particular potential distribution $\log(\varpi/b)$ is
important in two-dimensional hydrodynamics and the flow of electricity in plane sheets.
The form $\tan^{-1}(y/x)$ occurs in vortex motion and the magnetic field due to an electric
current.


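6·035. Doublet. Another type of potential of theoretical importance is that of a
doublet or dipole. If

$$\phi = \gamma\lambda\Bigl(\frac{1}{R_1} - \frac{1}{R_2}\Bigr), \tag{27}$$

where

$$R_1^2 = (x - a)^2 + y^2 + z^2, \qquad R_2^2 = (x + a)^2 + y^2 + z^2, \tag{28}$$

and we make $a \to 0$, $\lambda \to \infty$, in such a way that $2a\lambda \to \mu$,

$$\phi \to -\gamma\mu\,\frac{\partial}{\partial x}\frac{1}{r} = \gamma\mu\,\frac{x}{r^3}. \tag{29}$$

This is a solution of Laplace’s equation except at $r = 0$, since

$$\nabla^2\phi = -\gamma\mu\,\nabla^2\frac{\partial}{\partial x}\frac{1}{r} = -\gamma\mu\,\frac{\partial}{\partial x}\nabla^2\frac{1}{r} = 0. \tag{30}$$

$\phi$ is called the potential due to a doublet of moment $\mu$ at the origin, directed along the
axis of $x$. For a doublet of moment $\mu$ in a direction $l_i$ at $\xi_i$

$$\phi = -\gamma\mu\,l_i\frac{\partial}{\partial x_i}\frac{1}{R} = \gamma\mu\,l_i\frac{\partial}{\partial\xi_i}\frac{1}{R}, \tag{31}$$

where

$$R^2 = (x_i - \xi_i)^2. \tag{32}$$

The direction $l_i$ is called that of the axis of the doublet.

The doublet field represents closely that of a small bar magnet, and occurs in many
physical problems. The equipotential surfaces are closed, with the axis of the doublet
as axis of symmetry, and all touch at the doublet.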
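
The limiting process leading to (29) can be watched numerically by holding $2a\lambda = \mu$ fixed and letting $a$ shrink. The sketch below is an illustration under stated assumptions: the function names, the unit values of $\mu$ and $\gamma$ and the sample point $(1, 2, 2)$ are arbitrary choices.

```python
import math

def pair_potential(x, y, z, a, mu=1.0, gamma=1.0):
    """Formula (27) for two opposite point sources at x = +a and x = -a,
    with the strength lam chosen so that 2*a*lam = mu."""
    lam = mu / (2.0 * a)
    R1 = math.sqrt((x - a) ** 2 + y * y + z * z)
    R2 = math.sqrt((x + a) ** 2 + y * y + z * z)
    return gamma * lam * (1.0 / R1 - 1.0 / R2)

def doublet_potential(x, y, z, mu=1.0, gamma=1.0):
    """Limiting form (29): gamma * mu * x / r^3."""
    r = math.sqrt(x * x + y * y + z * z)
    return gamma * mu * x / r ** 3
```

At the point $(1, 2, 2)$, where $r = 3$, the doublet potential is $\gamma\mu/27$; `pair_potential` with successively smaller `a` should approach this value, the error falling roughly as $a^2$.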
6·036. A doublet shell is a distribution of doublets over a surface so that their axes
are all along the outward normal to the surface. Nothing like it exists, but some general
theorems make use of its properties. If $\mu\,dS$ is the moment per element of area $dS$, and
$l_i$ are the direction cosines of the normal,

$$\phi = -\gamma\iint\mu\,l_i\frac{\partial}{\partial x_i}\Bigl(\frac{1}{R}\Bigr)dS = \gamma\iint\mu\,l_i\frac{\partial}{\partial\xi_i}\Bigl(\frac{1}{R}\Bigr)dS. \tag{33}$$

This is capable of a simple transformation. Take $P$ to be the point $x_i$, $Q$ to be $\xi_i$, and let
$QP$ make an angle $\chi$ with the normal at $Q$. Then

$$l_i\frac{\partial}{\partial\xi_i}\frac{1}{R} = \frac{l_i(x_i - \xi_i)}{R^3} = \frac{\cos\chi}{R^2}. \tag{34}$$

If a cone is drawn from $P$ to the boundary of $dS$, and we draw around $P$ a sphere of radius
$\alpha$, the area of the intercept on this sphere is

$$\alpha^2\,d\omega = \alpha^2\,\frac{\cos\chi}{R^2}\,dS. \tag{35}$$

The ratio $d\omega$ is called the element of solid angle subtended by $dS$ at $P$, and the sum of the
$\alpha^2\,d\omega$ is the intercept on the sphere by a cone joining $P$ to the boundary of $S$; and

$$\phi = \gamma\iint\mu\,d\omega, \tag{36}$$

a remarkably simple form. It is important to attend to the sign of $d\omega$, which is that of
$\cos\chi$. In the figure $d\omega$ is positive for $\phi_P$, negative for $\phi_{P_1}$ and $\phi_{P_2}$. Evidently the potential
at a great distance behaves like $r^{-2}$ instead of $r^{-1}$; and if $\mu$ is constant over a closed surface
$\phi = 0$ at all external points and $\phi = -4\pi\gamma\mu$ at all internal ones. Hence $\phi$ has a
discontinuity $4\pi\gamma\mu$ on crossing the surface outwards. For a uniform plane doublet sheet $\phi$
jumps from $-2\pi\gamma\mu$ to $2\pi\gamma\mu$ on crossing it.

Consider a uniform doublet shell filling the half-plane $y = 0$, $x < 0$, with the axes
directed towards positive $y$. The solid angle subtended at $P$ is $\theta/2\pi$ times that subtended
by a whole sphere about $P$, and therefore is $2\theta$, where $\theta$ is $\tan^{-1}(y/x)$. Hence

$$\phi = 2\gamma\mu\theta = 2\gamma\mu\tan^{-1}\frac{y}{x}. \tag{37}$$

When $\theta \to \pi$, $\phi \to 2\pi\gamma\mu$; when $\theta \to -\pi$, $\phi \to -2\pi\gamma\mu$, and therefore again there is a
discontinuity $4\pi\gamma\mu$ on crossing the shell in the direction of the axes of the doublets. In this
case $\phi$ does not tend to 0 as $r \to \infty$, but that is because the sheet does not satisfy the
condition that all matter is within some given finite distance from the origin.

To sum up, if we call a surface of simple discontinuity of density a discontinuity of
zero order in the density, we can call a surface density a discontinuity of order $-1$,
and a doublet shell one of order $-2$. All can be regarded as limits of continuous
distributions. If $\rho = \rho_0\tanh(x/a)$, and $a \to 0$, $\rho \to \rho_0$ for $x > 0$ and $\to -\rho_0$ for $x < 0$. If

$$\rho = \frac{\sigma}{a\sqrt{\pi}}\exp\Bigl(-\frac{x^2}{a^2}\Bigr),$$

$\rho \to 0$ as $a \to 0$ for any $x \ne 0$, but $\int_{-h}^{h}\rho\,dx \to \sigma$ however small $h$ may be. If

$$\rho = \frac{2\mu}{a^3\sqrt{\pi}}\,x\exp\Bigl(-\frac{x^2}{a^2}\Bigr),$$

$\rho \to 0$ for any $x \ne 0$, $\int_{-h}^{h}\rho\,dx \to 0$, $\int_{-h}^{h}\rho x\,dx \to \mu$. The order
is that of increasing irregularity in the continuous distribution needed to give the
requisite properties in the limit: in the respective cases $\rho_{\max} = O(1)$, $O(a^{-1})$ and $O(a^{-2})$,
where $a$ represents the linear scale of the distribution. The corresponding orders of
discontinuity in the potential, measured by the orders of the lowest discontinuous
derivatives, are 2, 1, and 0. The rule can be extended to discontinuities in higher
derivatives; in most practical problems the order of the discontinuity of the potential
is higher by 2 than that of the density.
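
The limiting properties quoted for the two exponential densities can be checked numerically for a small but finite scale $a$. This is a sketch only; the function names, the values $\sigma = \mu = 1$, $a = 0.01$, $h = 0.1$ and the midpoint rule are assumptions made for illustration.

```python
import math

def layer_density(x, a, sigma=1.0):
    """Continuous density tending to a surface density sigma as a -> 0."""
    return sigma / (a * math.sqrt(math.pi)) * math.exp(-(x / a) ** 2)

def doublet_density(x, a, mu=1.0):
    """Continuous density tending to a doublet layer of moment mu as a -> 0."""
    return 2.0 * mu / (a ** 3 * math.sqrt(math.pi)) * x * math.exp(-(x / a) ** 2)

def integrate(f, lo, hi, n=20000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * h) for k in range(n)) * h
```

With $a = 0.01$ and $h = 0.1$ the integral of `layer_density` over $(-h, h)$ should be very near $\sigma$, that of `doublet_density` near zero, and the first moment of `doublet_density` near $\mu$, while the peak values grow like $a^{-1}$ and $a^{-2}$ respectively.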

6·04. Potential and field inside a continuous distribution. We have now to consider
more general cases. $\nabla^2\phi = 0$ always holds when $P$ is outside matter. When $P$ is
inside matter (which would be impossible for particle distributions), $\rho$ does not vanish at
$P$ and the integrand tends to infinity. It is therefore necessary to define $\phi$ as an improper
integral by first excluding a small region about $P$ from the region of integration and then
taking the limit of the integral when the diameter of the small region tends to zero. In
order that the limit shall exist we shall require that its value shall be the same for every
shape of the excluded region.

Lemma. The integral $\iiint d\tau/R^m$, where $R$ is measured from a point $P$ in the volume of
integration $V$, converges if $m < 3$; for all regions of the same volume the integral is largest for a sphere
with centre $P$ for $0 < m < 3$.

Take $\tau$ to be the region inside a sphere of radius $a$ with centre $P$ and let $\tau'$ be a region
entirely inside $\tau$. Then

$$\iiint_{\tau - \tau'}\frac{d\tau}{R^m} \le 4\pi\int_0^a R^{2-m}\,dR = \frac{4\pi}{3 - m}\,a^{3-m} \quad (m < 3). \tag{1}$$

For $m < 3$, given any positive $\epsilon$, we can take $a$ so that $\dfrac{4\pi}{3 - m}\,a^{3-m} < \epsilon$; then there is a sphere
of radius $a$ such that

$$\iiint_{\tau - \tau'}\frac{d\tau}{R^m} < \epsilon \tag{2}$$

for any $\tau'$ included in $\tau$. Hence if we exclude a cavity around $P$ from the volume of
integration the integral tends to a unique limit, independent of the shape of the cavity, as
we make all its dimensions tend to zero.

Suppose now that we replace the volume of integration $V$ by a sphere with centre $P$ of
radius $b$ with the same volume. Then

$$\iiint_{\text{sphere}}\frac{d\tau}{R^m} - \iiint_{V}\frac{d\tau}{R^m} = \iiint_{\substack{\text{parts of sphere}\\ \text{not in } V}}\frac{d\tau}{R^m} - \iiint_{\substack{\text{parts of } V\\ \text{not in sphere}}}\frac{d\tau}{R^m}. \tag{3}$$

In the integrands on the right the volumes of integration are equal; in the first $R < b$
and in the second $R > b$ except on parts of the boundary, where $R = b$. Hence the right
side is positive.

In what follows we shall assume throughout that $\rho$ is integrable, and this implies that
it is bounded.
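
The second part of the lemma can be illustrated for $m = 1$ by comparing a cube and a sphere of equal volume, both centred on $P$. The sketch below is an illustration, not part of the original argument: the function names, the unit cube and the grid size are assumptions, and the cube integral is only estimated by a midpoint rule whose sample points avoid $P$.

```python
import math

def cube_integral_inv_r(n=40):
    """Midpoint-rule estimate of the integral of 1/R over the unit cube
    centred on P; n is even, so no sample point coincides with P."""
    h = 1.0 / n
    coords = [(k + 0.5) * h - 0.5 for k in range(n)]
    total = 0.0
    for x in coords:
        for y in coords:
            for z in coords:
                total += 1.0 / math.sqrt(x * x + y * y + z * z)
    return total * h ** 3

def sphere_integral(volume, m):
    """Integral of R^(-m) over a sphere of given volume centred on P,
    from the right side of (1) with the radius fixed by the volume."""
    b = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return 4.0 * math.pi * b ** (3.0 - m) / (3.0 - m)
```

Comparing `cube_integral_inv_r()` with `sphere_integral(1.0, 1.0)` shows the sphere giving the larger value, as the lemma requires; the estimate for the cube converges despite the singularity at $P$ because $m < 3$.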
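We have to consider the two integrals for the potential and the intensity of field $X_i$,

$$\phi = \gamma\iiint\frac{\rho}{R}\,d\tau, \tag{4}$$

$$X_i = -\gamma\iiint\frac{\rho}{R^3}\,(x_i - \xi_i)\,d\tau. \tag{5}$$

These are both improper integrals. Consider a spherical region $\tau$ about $P$ and a cavity $\tau'$
of any form, included in $\tau$ and enclosing $P$. If we consider the contribution from the region
$\tau - \tau'$ and denote the upper bound of $|\rho|$ within the sphere by $\rho_m$ we have

$$\Bigl|\iiint_{\tau - \tau'}\frac{\rho}{R}\,d\tau\Bigr| \le \rho_m\iiint_{\tau - \tau'}\frac{d\tau}{R}, \tag{6}$$

and

$$\Bigl|\iiint_{\tau - \tau'}\frac{\rho}{R^3}\,(x_i - \xi_i)\,d\tau\Bigr| \le \rho_m\iiint_{\tau - \tau'}\frac{d\tau}{R^2}. \tag{7}$$

Hence from the lemma the integrals on the right can be made as small as we please, and
those for $\phi$ and $X_i$ converge to limits independent of the shape of the cavity.

$X_i$ is obtained formally from $\phi$ by differentiating under the integral sign, but to show
that it is actually equal to $\partial\phi/\partial x_i$ we have to show not only that the integrals exist but also
that, if $P'(x_i + hl_i)$ is a point in the neighbourhood of $P$, then given any positive $\epsilon$ we can find
$h_0$ such that for all $0 < h < h_0$

$$\Bigl|\frac{\phi(P') - \phi(P)}{h} - l_i X_i\Bigr| < \epsilon. \tag{8}$$

We divide the volume of integration into (i) the interior of a sphere $\tau_1$ with centre $P$ and
radius $a$ and (ii) the remainder of the region $\tau_0$. We denote the contributions to the
integrals from these two parts by suffixes 1 and 0 respectively. Let $P'$ be any other point
interior to $\tau_1$. We shall show that (i) for any given positive $\epsilon$ the contribution of $\tau_1$ to the
left side of (8) can be made less than $\tfrac12\epsilon$ for any $P'$ by taking $a$ sufficiently small, (ii) with
$a$ fixed we can find $h_0$ such that the contribution from $\tau_0$ is also less than $\tfrac12\epsilon$ for all $h < h_0 < a$.

(i) We denote distances from $P'$ by $R'$. Then

$$\phi_1(P') - \phi_1(P) = \gamma\iiint_{\tau_1}\rho\Bigl(\frac{1}{R'} - \frac{1}{R}\Bigr)d\tau = \gamma\iiint_{\tau_1}\rho\,\frac{R - R'}{RR'}\,d\tau. \tag{9}$$

We have

$$|R - R'| \le h < a, \tag{10}$$

$$\frac{1}{RR'} \le \frac12\Bigl(\frac{1}{R^2} + \frac{1}{R'^2}\Bigr). \tag{11}$$

Hence

$$\Bigl|\frac{\phi_1(P') - \phi_1(P)}{h}\Bigr| \le \tfrac12\gamma\rho_m\iiint_{\tau_1}\Bigl(\frac{1}{R^2} + \frac{1}{R'^2}\Bigr)d\tau. \tag{12}$$

But, from the second part of the lemma, since $\iiint d\tau/R'^2$ is taken over the interior of a
sphere with its centre not at the origin, it is $\le \iiint d\tau/R^2$, which is taken over the interior of
a sphere of the same volume with its centre at the origin. Hence

$$\Bigl|\frac{\phi_1(P') - \phi_1(P)}{h}\Bigr| \le \gamma\rho_m\iiint_{\tau_1}\frac{d\tau}{R^2} = 4\pi\gamma\rho_m a. \tag{13}$$

Also, since

$$|l_i(x_i - \xi_i)| \le R, \tag{14}$$

$$|l_i X_{i1}| \le \gamma\rho_m\iiint_{\tau_1}\frac{d\tau}{R^2} = 4\pi\gamma\rho_m a. \tag{15}$$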
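Hence

$$\Bigl|\frac{\phi_1(P') - \phi_1(P)}{h} - l_i X_{i1}\Bigr| \le 8\pi\gamma\rho_m a \tag{16}$$

for all $P'$ in $\tau_1$. Given any positive $\epsilon$ we choose $a$ so that

$$8\pi\gamma\rho_m a < \tfrac12\epsilon. \tag{17}$$

(ii) Since $P$ and $P'$ are both external points to $\tau_0$, for a given value of $a$,

$$\Bigl|\frac{\phi_0(P') - \phi_0(P)}{h} - l_i X_{i0}\Bigr|$$

can be made less than any given positive $\tfrac12\epsilon$ by taking $h < h_0 < a$, where $h_0$ may depend on $a$.
We then have that

$$\Bigl|\frac{\phi(P') - \phi(P)}{h} - l_i X_i\Bigr| \le \Bigl|\frac{\phi_1(P') - \phi_1(P)}{h} - l_i X_{i1}\Bigr| + \Bigl|\frac{\phi_0(P') - \phi_0(P)}{h} - l_i X_{i0}\Bigr| < \epsilon \tag{18}$$

for all $h < h_0$.

Therefore, since $\epsilon$ is arbitrarily small,

$$\phi(P') = \phi(P) + l_i hX_i + \omega h, \tag{19}$$

where $\omega \to 0$ with $h$ uniformly with respect to $l_i$; hence $\phi$ is differentiable and has derivatives
$X_i$, and

$$X_i = \frac{\partial\phi}{\partial x_i} = \gamma\iiint\rho\,\frac{\partial}{\partial x_i}\Bigl(\frac{1}{R}\Bigr)d\tau. \tag{20}$$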


Now apply Green’s lemma to the region between a small sphere $\Sigma$ about $P$ and the surface
$S$ of the body. We have

$$\frac{\partial\phi}{\partial x_i} = \gamma\iiint\rho\,\frac{\partial}{\partial x_i}\Bigl(\frac{1}{R}\Bigr)d\tau = -\gamma\iiint\rho\,\frac{\partial}{\partial\xi_i}\Bigl(\frac{1}{R}\Bigr)d\tau. \tag{21}$$

Apply Green’s lemma to this integral; this will be justified, in the conditions so far adopted
for the lemma, if $\rho$ has integrable derivatives. In surface integrals we take the direction of
the normal away from $P$; then

$$\frac{\partial\phi}{\partial x_i} = -\gamma\iint_S l_i\rho\,\frac{dS}{R} + \gamma\iiint\frac{\partial\rho}{\partial\xi_i}\,\frac{d\tau}{R} + \gamma\iint_\Sigma l_i\rho\,\frac{dS}{R}. \tag{22}$$

When the radius of $\Sigma$ tends to 0 the last integral tends to 0; hence

$$\frac{\partial\phi}{\partial x_i} = -\gamma\iint_S l_i\rho\,\frac{dS}{R} + \gamma\iiint\frac{\partial\rho}{\partial\xi_i}\,\frac{d\tau}{R}, \tag{23}$$

the last being an improper integral through the interior of $S$. Hence $X_i$ is the sum of two
potential functions, one due to a surface density $-l_i\rho$ over $S$, the other to density $\partial\rho/\partial\xi_i$
through the interior of $S$. Since $\rho$ has been supposed to have integrable derivatives,
$\partial\rho/\partial\xi_i$ is integrable and bounded in the region. Hence $\partial\phi/\partial x_i$ in its turn is differentiable
in the region.

Note that the argument up to (20) assumes only that $\rho$ is integrable; $\phi$ exists and has
first derivatives at a finite discontinuity of $\rho$, as for instance at a free surface. The further
condition that $\rho$ has integrable derivatives is used in the transformation leading to (23).

Note also that if a real cavity is made in the body and the distribution of $\rho$ outside the
cavity is kept unaltered, the field in the cavity will be given by $X_{i0}$ and therefore is
arbitrarily near $X_i$ if the cavity is small enough.
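6·041. Gauss’s theorem. Let us consider the flux of $\partial\phi/\partial x_i$ through a closed surface
$S$. The points $x_i$ are taken to be on $S$. $\rho$ is a function of the $\xi_i$ only and does not involve $x_i$;
hence

$$\iint_S l_i\frac{\partial\phi}{\partial x_i}\,dS = \gamma\iint_S l_i\,dS\iiint\rho\,\frac{\partial}{\partial x_i}\Bigl(\frac{1}{R}\Bigr)d\tau = \gamma\iiint\rho\,d\tau\iint_S l_i\frac{\partial}{\partial x_i}\Bigl(\frac{1}{R}\Bigr)dS, \tag{24}$$

since inversion of the order of integration is permissible provided $\rho$ is continuous and $S$
has a finite area (actually in somewhat wider conditions). Now in the integration over $S$
the $\xi_i$ are to be treated as constants. If a point $Q$ with coordinates $\xi_i$ lies outside $S$, then
by Green’s lemma

$$\iint_S l_i\frac{\partial}{\partial x_i}\Bigl(\frac{1}{R}\Bigr)dS = \iiint\nabla^2\Bigl(\frac{1}{R}\Bigr)dx_1\,dx_2\,dx_3, \tag{25}$$

taken through the interior of $S$. If $Q$ is outside $S$, $\nabla^2(1/R) = 0$ at all points within $S$ and
the integral is zero. If $Q$ is inside $S$, we have, as in 6·01 (7),

$$\iint_S l_i\frac{\partial}{\partial x_i}\Bigl(\frac{1}{R}\Bigr)dS = -4\pi. \tag{26}$$

Hence

$$\iint_S l_i\frac{\partial\phi}{\partial x_i}\,dS = -4\pi\gamma\iiint\rho\,d\tau, \tag{27}$$

where the range of integration for $\tau$ is through the interior of $S$. This extends Gauss’s
theorem to continuous distributions.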


6·042. First treatment of Poisson’s equation. Gauss’s theorem provides an easy,
though not satisfactory, way of getting Poisson’s equation. The last integral in (27) is
unaltered if we now replace $\xi_i$ by $x_i$ everywhere, including $\rho$; and the first integral is

$$\iiint\nabla^2\phi\,dx_1\,dx_2\,dx_3. \tag{28}$$

Thus

$$\iiint\nabla^2\phi\,dx_1\,dx_2\,dx_3 = -4\pi\gamma\iiint\rho\,dx_1\,dx_2\,dx_3. \tag{29}$$

This must be true for every region, and therefore

$$\nabla^2\phi = -4\pi\gamma\rho \tag{30}$$

almost everywhere. This is Poisson’s equation, and gives the appropriate modification
of Laplace’s equation inside matter.

6·043. Second treatment of Poisson’s equation. This is one way of getting
Poisson’s equation, but is not altogether satisfactory for three reasons. (i) The use of
Green’s lemma to derive (28) from (27) has been completely justified only if $\partial\phi/\partial x_i$ have
integrable derivatives with regard to all the coordinates; we have proved that the
derivatives $\partial^2\phi/\partial x_i\,\partial x_k$ exist in certain conditions, but they might not be integrable. (ii) There is
also a complication from points $Q$ actually on $S$, but this is easily treated. For in that case
the integral in (26) is easily seen to be bounded, though not equal to either 0 or $-4\pi$; and
since the total volume of the points on $S$ is 0 it does not affect the right of (27). (iii) For one
variable, if for all $x$ of a range $\int_a^x f(t)\,dt = \int_a^x g(t)\,dt$, it follows only that $f(x) = g(x)$ almost
everywhere; they might differ at any set of values of $x$ capable of being enclosed in ranges
of arbitrary small total length. The proof that $f(x) = g(x)$ everywhere is easily completed
if it is known that $f(x)$ and $g(x)$ are continuous, but in the present case it has not been
proved that $\nabla^2\phi$ is continuous even if $\rho$ is.

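The following treatment starts from 6·04 (23). If we consider the contributions to
$\partial\phi/\partial x_i$ from the exterior of a small sphere $\Sigma$ of radius $a$ it is clear from 6·04 (21) that $\nabla^2\phi_0 = 0$.
For the contributions from the interior of $\Sigma$ we can apply 6·04 (23) to the interior of $\Sigma$; then

$$\frac{\partial^2\phi_1}{\partial x_i\,\partial x_k} = -\gamma\iint_\Sigma l_i\rho\,\frac{\partial}{\partial x_k}\Bigl(\frac{1}{R}\Bigr)dS + \gamma\iiint\frac{\partial\rho}{\partial\xi_i}\,\frac{\partial}{\partial x_k}\Bigl(\frac{1}{R}\Bigr)d\tau. \tag{31}$$

In this the second integral tends to zero with $a$. For the first, if we put $i = k$ and sum,

$$-\gamma\iint_\Sigma l_i\rho\,\frac{\partial}{\partial x_i}\Bigl(\frac{1}{R}\Bigr)dS = -\frac{\gamma}{a^2}\iint_\Sigma\rho\,dS. \tag{32}$$

Since $\rho$ is continuous at $P$ the integral $a^{-2}\iint\rho\,dS$ tends to $4\pi\rho_P$ as $a \to 0$; hence

$$\nabla^2\phi = -4\pi\gamma\rho. \tag{33}$$

We have proved this under the (sufficient) condition that $\rho$ has integrable derivatives.
This condition is not necessary. Consider a heterogeneous sphere of radius $a$, with $\rho$
a function of $r$. The potential at an internal point is

$$\phi = \frac{4\pi\gamma}{r}\int_0^r\rho\alpha^2\,d\alpha + 4\pi\gamma\int_r^a\rho\alpha\,d\alpha,$$

$$\frac{\partial\phi}{\partial x_i} = -\frac{4\pi\gamma x_i}{r^3}\int_0^r\rho\alpha^2\,d\alpha,$$

$$\frac{\partial^2\phi}{\partial x_i\,\partial x_k} = -4\pi\gamma\Bigl\{\Bigl(\frac{\delta_{ik}}{r^3} - \frac{3x_i x_k}{r^5}\Bigr)\int_0^r\rho\alpha^2\,d\alpha + \frac{x_i x_k}{r^2}\,\rho\Bigr\},$$

$$\nabla^2\phi = -4\pi\gamma\rho,$$

with no restriction on $\rho$ except that it is continuous at the point considered and integrable
for $0 \le r \le a$. (See also Note 6·043 a.)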
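
The heterogeneous sphere offers a simple numerical check of Poisson’s equation: the internal potential can be built from the two radial integrals and the radial Laplacian $r^{-1}\,d^2(r\phi)/dr^2$ estimated by differences. The sketch below rests on assumptions that are not in the text: the function names, the sample density $\rho(r) = 1 + r$, the unit values of $a$ and $\gamma$, and the quadrature and step sizes.

```python
import math

def hetero_potential(r, rho, a=1.0, gamma=1.0, n=4000):
    """Internal potential of a sphere whose density rho depends on radius only:
    (4 pi gamma / r) * int_0^r rho s^2 ds + 4 pi gamma * int_r^a rho s ds."""
    def midpoint(f, lo, hi):
        h = (hi - lo) / n
        return sum(f(lo + (k + 0.5) * h) for k in range(n)) * h
    inner = midpoint(lambda s: rho(s) * s * s, 0.0, r)  # shells inside r
    outer = midpoint(lambda s: rho(s) * s, r, a)        # shells outside r
    return 4.0 * math.pi * gamma * (inner / r + outer)

def radial_laplacian(f, r, h=1e-2):
    """(1/r) d^2(r f)/dr^2 by a central second difference."""
    def g(s):
        return s * f(s)
    return (g(r + h) - 2.0 * g(r) + g(r - h)) / (h * h * r)
```

For `rho = lambda s: 1.0 + s` the value of `radial_laplacian(lambda r: hetero_potential(r, rho), 0.5)` should be close to $-4\pi\gamma\rho(0.5) = -6\pi$; the agreement is limited by the difference step rather than by any smoothness required of $\rho$ beyond continuity at the point.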


6·05. Surface distributions. Surface density. Let us suppose that over a surface
$S$ there is a bounded concentration $\sigma$ per unit area; then the potential at $P$ is

$$\phi = \gamma\iint_S\frac{\sigma}{R}\,dS. \tag{1}$$

If $P$ is on the surface this is to be interpreted as an improper integral by excluding the
interior of a curve $C$ on the surface about $P$ and then making $C$ shrink up to $P$. Evidently
Laplace’s equation is satisfied at all points not on $S$.

Take a point $O$ on $S$, and the axis of $x_3$ so that at every point of $S$ within a certain
non-zero distance from $O$, except possibly at $O$ itself, there is a normal to $S$ making an angle
with the $x_3$ axis differing from a right angle by more than some fixed amount. If $S$ is a
smooth surface we need only take the axis of $x_3$ to be the normal at $O$. If $S$ has a conical
point at $O$ (not a cusp) we can take the axis of $x_3$ within the angle. Then we can take a
curve $C$ about $O$ so that $\sigma/l_3$ is bounded within $C$, where $l_3$ is the third direction cosine
of the normal. Then if $P$ is $(0, 0, x_3)$ and we take a closed curve $c$ on $S$ so that $c$ is entirely
within $C$ the contribution to $\phi$ from the part of the surface between $c$ and $C$ is

$$\gamma\iint\frac{\sigma}{l_3}\,\frac{d\xi_1\,d\xi_2}{\{\xi_1^2 + \xi_2^2 + (\xi_3 - x_3)^2\}^{1/2}} \tag{2}$$

over the part stated. If we take cylindrical coordinates $\varpi$, $\lambda$, and

$$|\sigma/l_3| < K, \tag{3}$$

then

$$\xi_1^2 + \xi_2^2 + (\xi_3 - x_3)^2 \ge \varpi^2, \tag{4}$$

and (2) has modulus

$$< \gamma K\iint d\varpi\,d\lambda \le 2\pi\gamma\alpha K, \tag{5}$$

where $\alpha$ is the greatest value of $\varpi$ on $C$. This is arbitrarily small, and therefore even for
$x_3 = 0$ the improper integral has the same value irrespective of the limiting form of $C$.
Hence $\phi$ has a definite value even if $P$ is on $S$.

Now take $C$ so that $4\pi\gamma\alpha K < \tfrac12\epsilon$. Then take $x_3$ so small that the contributions to $\phi$ at
$P$ and $O$, say $\phi$ and $\phi_0$, from the part of $S$ outside $C$ differ by less than $\tfrac12\epsilon$; then

$$|\phi - \phi_0| < 4\pi\gamma\alpha K + \tfrac12\epsilon < \epsilon. \tag{6}$$

Hence $\phi$ approaches $\phi_0$ continuously when $P$ approaches $O$ from either side.

$\phi$ is also continuous on $S$. For if $P$ is on $S$, within $C$, and at a distance $r$ from $O$, the
largest distance of $P$ from any point on $C$ is $\le \alpha + r \le 2\alpha$. Hence the contribution to
$|\phi - \phi_0|$ from the interior of $C$ is $< 6\pi\gamma\alpha K$, by (5). We take $\alpha$ so that this is $< \tfrac12\epsilon$ and then
take $r$ so small that the contributions from parts outside $C$ differ by less than $\tfrac12\epsilon$. Then
$|\phi - \phi_0| < \epsilon$, and we can find a region on $S$ about $O$ such that $|\phi - \phi_0| < \epsilon$ at all points
of it. Hence $\phi$ is continuous everywhere.

If $S$ has an edge passing through $O$, or if a finite number of edges meet at $O$, the results
follow on applying the argument to each face separately and adding.

The results still follow for a cusp if $\sigma$ is bounded. These cover the cases possible for
surfaces not intersected an infinite number of times by any straight line.

Referring back to 6·04 (23) we see that at a simple discontinuity of a volume density
$\partial\phi/\partial x_i$ is continuous. For $\partial\phi/\partial x_i$ can be expressed as the sum of two potential functions,
one due to a volume density and the other to a surface density. But both these potentials
are continuous, by what we have just proved.

The normal gradient of <f> is discontinuous on S. If dn is an element of the normal to 8 
measured in the same sense on both sides of S, we indicate the value of d/dn on the side 
where dn is positive as we recede from 8 by suffix 0, and on the other side by suffix 1. We 
show that in suitable conditions 



This result is found most easily by using Gauss’s theorem, but the argument is unsatis¬ 
factory for similar reasons to those given with regard to the derivation of Poisson’s 
equation. Direct study of the expression for dftjdn is more satisfactory. We take a small 
piece of the tangent plane at a point 0 of the surface as a standard of comparison. We take 
O as origin and the tangent plane there as x 3 = 0. We assume that the surface has finite 
curvatures at and near O. Take C to be such that all points of it are at distance a from the 
axis of x 3 , and consider the contribution to d<fjdx 3 at points on the axis of x 3 from the part 
of 8 within C. This is 


% 3~ X 3 


i3~ X 3 
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for < a 2 , where r = crjl z . We have to consider the error introduced by taking £ g = 0. 
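For any function whose derivative exists 

$$f(\xi_3) = f(0) + \xi_3\,p, \tag{9}$$

where p = f′(θξ_3), 0 < θ < 1. But 

$$\frac{\partial}{\partial\xi_3}\,\frac{\xi_3 - x_3}{R^3} = \frac{1}{R^3} - \frac{3(\xi_3 - x_3)^2}{R^5}, \tag{10}$$

$$\left|\frac{\partial}{\partial\xi_3}\,\frac{\xi_3 - x_3}{R^3}\right| \le \frac{2}{R^3}, \tag{11}$$

and 

$$\frac{\xi_3 - x_3}{R^3} = -\frac{x_3}{R_0^3} + \frac{2A\xi_3}{R_1^3}, \tag{12}$$

where |A| ≤ 1, 

$$R_0^2 = \xi_1^2 + \xi_2^2 + x_3^2 = \varpi^2 + x_3^2, \tag{13}$$

and R_1 lies between R_0 and R. Then 

$$\gamma\iint \sigma\,\frac{\xi_3 - x_3}{R^3}\,dS = -\gamma x_3\iint \frac{\tau}{R_0^3}\,\varpi\,d\varpi\,d\chi + \gamma\iint \frac{2A\tau\xi_3}{R_1^3}\,\varpi\,d\varpi\,d\chi. \tag{14}$$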

In the second integral on the right, R_1 ≥ ϖ, and since the surface is assumed to have a 
finite curvature ξ_3 = O(ϖ²). The integrand is therefore bounded and the integral is small of 
order a. For any ε we can therefore choose a so that the second integral is less than any 
assigned fraction of ε. 

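Again we have ϖ dϖ = R_0 dR_0, since x_3 is taken constant; then 

$$-\gamma x_3\int_0^{2\pi}\!\!\int_0^{a}\frac{\tau}{R_0^3}\,\varpi\,d\varpi\,d\chi = -\gamma x_3\int_0^{2\pi}\!\!\int_{|x_3|}^{\sqrt{(a^2+x_3^2)}}\frac{\tau}{R_0^2}\,dR_0\,d\chi. \tag{15}$$

τ is supposed continuous. This is equal to 

$$-2\pi\gamma\tau_1\left\{\frac{x_3}{|x_3|} - \frac{x_3}{\sqrt{(a^2+x_3^2)}}\right\}, \tag{16}$$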

where τ_1 lies between the upper and lower bounds of τ within C. Since τ is continuous we 
can put τ = σ + η, where σ is evaluated at O and equal to τ there, and η → 0 with a. Also 
the modulus of the terms in brackets is < 1. Hence if a is small enough (15) differs from 

$$-2\pi\gamma\sigma\left\{\frac{x_3}{|x_3|} - \frac{x_3}{\sqrt{(a^2+x_3^2)}}\right\}$$

by less than an assigned fraction of ε. Next, by taking δ sufficiently small compared 
with a we can make the second term in the brackets as small as we like, say less than a 
further assigned fraction of ε, for all |x_3| < δ. In all (8) will differ from −2πγσ x_3/|x_3| by 
less than the sum of these fractions of ε. It therefore changes discontinuously by an amount 
arbitrarily near to −4πγσ when x_3 increases through 0. 

Further, since the contribution to ∂φ/∂x_3 from the part of S outside C is continuous, 
we can choose x_3 so that its values at O and P differ by less than an assigned fraction of ε. 
Then for values of x_3 such that |x_3| < δ, but x_3 has opposite signs, the difference between 
the values of ∂φ/∂x_3 on the two sides of S differs from −4πγσ by less than ε. 

Here a has to be chosen first, so as to make the differences arising in (14) and (15) from 
the variations of ξ_3 and τ within C each less than a suitable fraction of ε; then x_3 is chosen 
so that those arising from the second term in (16) and from τ − σ are each less than a 
suitable fraction of ε, and the contribution from the parts of S outside C differs by less than 
a suitable fraction of ε from its value at O; thus we show that ∂φ/∂x_3 tends to a limit as 
x_3 → 0 from either side, the difference between it and its limiting value on that side being 
less than ½ε; and finally the difference between the limiting values on the two sides differs 
from −4πγσ by less than ε, which is arbitrarily small. 

The argument for the tangential derivatives proceeds similarly. We have 

$$\frac{\partial\phi}{\partial x_1} = \gamma\iint \sigma\,\frac{\xi_1 - x_1}{R^3}\,dS. \tag{17}$$


The only serious modification of the argument is in the treatment of the integral over 
the circle in the tangent plane. The integrand here is an even function of x_3 and the 
integral must therefore have the same limit when x_3 → 0 from either side if it has a limit 
at all. Now 

$$\iint \tau\,\frac{\xi_1 - x_1}{R^3}\,d\xi_1\,d\xi_2 = -\oint \frac{\tau}{R}\,d\xi_2 + \iint \frac{1}{R}\,\frac{\partial\tau}{\partial\xi_1}\,d\xi_1\,d\xi_2,$$

in which both terms are potential functions. Hence if τ is differentiable ∂φ/∂x_1 and ∂φ/∂x_2 
exist at all points of Ox_3 and are continuous. 

These various results require conditions of different stringency on the form of the 
surface and the distribution of surface density. Continuity of φ is satisfied if σ is bounded 
and integrable and S meets no given line more than a fixed number of times. The 
normal gradient has discontinuity −4πγσ if σ is continuous and S has finite curvatures. 
The tangential derivatives of φ are continuous if σ has integrable derivatives and S has 
finite curvatures. Some of these conditions can be somewhat relaxed, but the cases 
where they are not satisfied and the theorems remain true are rare. 


6·06. Doublet shell. We have already had 

$$\phi = \gamma\iint \mu\,d\omega.$$

If, as before, we take O on the surface and a small circuit C around it, the contribution 
to φ from the part of S outside C is continuous. For the part within C, if μ is continuous, 
the contribution can be made as near as we like to γμ_0 ∬dω. But this increases by 4πγμ_0 
when the point considered passes through O. 

There is a relation between this result and the discontinuity in ∂φ/∂n for a surface 
density. If we displace the whole surface by −dn parallel to Ox_3, the change in φ will be 
to the first order in dn equal to that of a distribution of doublets of moment density −σ dn 
with axes parallel to the x_3 axis, and for points close to this axis the axes will be nearly 
normal to the surface. This relation can be presented formally in various ways so as to 
show directly that the discontinuity in ∂φ/∂n due to surface density σ is the same as that 
in φ due to doublet density −σ, but the partial integrations are tricky. 


6·07. Uniqueness theorem for solutions of Laplace’s equation. Let φ and φ′ 
be two different functions satisfying Laplace’s equation within a closed surface S and 
having continuous first and second derivatives. Then their difference φ″ also satisfies it. 
But by Green’s theorem 

$$\iiint \left(\frac{\partial\phi''}{\partial x_i}\right)^2 d\tau = \iint \phi''\,\frac{\partial\phi''}{\partial n}\,dS. \tag{1}$$

Now if either φ and φ′, or ∂φ/∂n and ∂φ′/∂n, are equal at all points of S, this integral 
vanishes. But the integral on the left is not less than 0 and can vanish only if 

$$\frac{\partial\phi''}{\partial x_i} = 0 \tag{2}$$

at all points within S. Hence φ″ is a constant and will be 0 if it is 0 on S. Hence (1) if φ 
is given over S, it is uniquely determined inside S, (2) if ∂φ/∂n is given over S, φ is uniquely 
determined inside S except for an additive constant. 

6·071. The same result is true if φ and φ′ satisfy Laplace’s equation outside a closed 
surface S, provided that they decrease sufficiently rapidly at large distances. Take a 
sphere of large radius A surrounding S and apply (1) to the region between S and this 
sphere. dν on S will now be towards the interior of S. Now let φ, φ′ be given to tend to 0 
at large distances and ∂φ/∂x_i, ∂φ′/∂x_i to be O(1/r²) when r is large. Then φ and φ′ are 
O(1/r). Then the integral over the large sphere is O(1/A) and decreases indefinitely as 
A → ∞. We therefore have 

$$\iiint \left(\frac{\partial\phi''}{\partial x_i}\right)^2 d\tau = \iint \phi''\,\frac{\partial\phi''}{\partial\nu}\,dS,$$

where the integration on the left is through all space outside of S. Hence the integral on 
the left vanishes if φ = φ′ or ∂φ/∂n = ∂φ′/∂n at all points of S; then ∂φ″/∂x_i = 0 outside S. 
In this case φ″ → 0 at infinity and the difference must be zero at all points. Thus if φ or 
∂φ/∂n is given at all points of S and φ satisfies Laplace’s equation outside it, while φ → 0 and 
∂φ/∂x_i = O(1/r²) when r → ∞, φ is uniquely determined at all points outside S. 

The theorems remain true if φ is given over part of S and ∂φ/∂n is given over the 
remainder. For, forming φ″ as before, the integrals over S still vanish. 

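6·072. Minimal theorems. Let φ satisfy Laplace’s equation within S and let φ′ be 
a function not satisfying Laplace’s equation but such that φ′ = φ at all points of S. Put 
φ′ = φ + φ″. Then 

$$\iiint \frac{\partial\phi}{\partial x_i}\,\frac{\partial\phi''}{\partial x_i}\,d\tau = \iint \phi''\,\frac{\partial\phi}{\partial n}\,dS = 0. \tag{1}$$

Hence 

$$\iiint \left(\frac{\partial\phi'}{\partial x_i}\right)^2 d\tau = \iiint \left(\frac{\partial\phi}{\partial x_i}\right)^2 d\tau + \iiint \left(\frac{\partial\phi''}{\partial x_i}\right)^2 d\tau > \iiint \left(\frac{\partial\phi}{\partial x_i}\right)^2 d\tau. \tag{2}$$

The solution of Laplace’s equation therefore makes ∭ (∂φ/∂x_i)² dτ a minimum subject to 
the given boundary conditions. The extension to the region outside S, subject to the same 
restrictions as in the last theorem about the behaviour of φ, φ′ for large r, is simple. 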

6·073. A related theorem due to Kelvin is as follows. φ is taken to satisfy ∇²φ = 0 
within S, and u_i is a solenoidal vector such that l_i u_i = ∂φ/∂n over S. Put 

$$u_i = \frac{\partial\phi}{\partial x_i} + v_i.$$

Then l_i v_i = 0 on S; ∂v_i/∂x_i = 0. 

$$\iiint \left\{u_i^2 - \left(\frac{\partial\phi}{\partial x_i}\right)^2 - v_i^2\right\} d\tau = 2\iiint v_i\,\frac{\partial\phi}{\partial x_i}\,d\tau = 2\iiint \left\{\frac{\partial}{\partial x_i}(\phi v_i) - \phi\,\frac{\partial v_i}{\partial x_i}\right\} d\tau = 2\iint \phi\,l_i v_i\,dS = 0, \tag{3}$$

and therefore 

$$\iiint u_i^2\,d\tau > \iiint \left(\frac{\partial\phi}{\partial x_i}\right)^2 d\tau. \tag{4}$$

Thus ∭ u_i² dτ is a minimum, given that u_i is a solenoidal vector within S and has a given 
normal component on S, if u_i is the gradient of a solution of ∇²φ = 0. 

These theorems prove that subject to the given conditions there cannot be more than 
one solution. The proof that one solution exists is difficult.* 

6·074. Uniqueness theorem in two dimensions. The above theorems apply 
equally well to two-dimensional problems for the interiors of closed curves and 
cylinders. 

Modification is needed for external problems because in two important cases the 
vector u_i in two dimensions is O(1/r), not O(1/r²), when r is large. For a cylinder with a 
finite charge per unit length the potential at a large distance behaves like log r. For two 
plates extending to infinity with asymptotes inclined at θ_1 and θ_2 to the x axis, and with 
different limiting values of φ on them at large distances, the conditions will be satisfied 
if, for large r, φ = A + Bθ, which is a solution of Laplace’s equation. B must be zero 
for the exterior of a closed curve to make φ single-valued; but it can be non-zero if the 
boundary is such that a complete circuit about the origin is impossible for large r without 
crossing the boundary. In both cases u_i will be O(1/r). But then ∬ u_i² dS diverges since 
dS = r dr dθ. It therefore becomes meaningless to say that if we alter u_i we shall increase 
this integral. The same applies if the boundary, which we shall now call C, goes to infinity 
along two curves with parallel asymptotes, since if φ tends to different limits on them the 
normal gradient tends to a constant and again the integral diverges. The minimal theorems 
therefore do not hold in two dimensions unless there is some further restriction on the 
behaviour of u_i at large distances. 

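If, however, we take the extra conditions in the uniqueness theorem to be, for a closed 
curve, that a, b exist so that for large r 

$$\phi - a\log\frac{r}{b} \to 0, \qquad \phi' - a\log\frac{r}{b} \to 0,$$

$$\frac{\partial}{\partial r}\left(\phi - a\log\frac{r}{b}\right) = O(1/r^2), \qquad \frac{\partial}{\partial r}\left(\phi' - a\log\frac{r}{b}\right) = O(1/r^2),$$

we have 

$$\phi - \phi' = O(1/r), \qquad \frac{\partial}{\partial r}(\phi - \phi') = O(1/r^2),$$

and for a large circle 

$$\int (\phi - \phi')\,\frac{\partial}{\partial r}(\phi - \phi')\,ds = O(1/r^2).$$

Hence the uniqueness theorem still holds if the behaviour of φ and φ′ at large distances is 
suitably restricted. 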

Evidently, from Green’s theorem, ∫ (∂φ/∂n) ds has the same value for C and any closed 
curve enclosing it. This integral can be taken as a datum in many problems, since in 
electrostatics it is related directly to the total charge and in hydrodynamics to the total 
rate of outflow, and we can restrict ourselves to variations that preserve these. But if 
this integral is finite and not zero the radial component must tend to zero like 1/r, and 
the potential will behave like log r, with a given coefficient. 

* O. D. Kellogg, Foundations of Potential Theory, 236 and 277-328. 

If C goes to infinity in two directions we can take φ, φ′ to behave like aθ + b for large r, 
a and b being chosen so as to fit the limiting values of φ on C in the two directions. If 
we take 

$$\phi - a\theta - b = o(1), \qquad \phi' - a\theta - b = o(1),$$

$$\frac{\partial}{\partial r}(\phi - a\theta - b) = o(1/r), \qquad \frac{\partial}{\partial r}(\phi' - a\theta - b) = o(1/r),$$

then 

$$\int (\phi - \phi')\,\frac{\partial}{\partial r}(\phi - \phi')\,ds = o(1)$$

for a large arc, and the uniqueness theorem holds. This applies equally well to a 
semi-infinite plate, for which we can take the limiting values of θ on the two faces to be 0 and 2π. 

If C goes to infinity with two parallel asymptotes, on which φ has limits, we can take 
the external boundary to be a curve normal to both. A sufficient extra condition to ensure 
uniqueness is now that ∂φ/∂n → 0 on any curve cutting the two asymptotes normally at 
a large distance. 

6·08. The Rayleigh-Ritz method. Minimal theorems are the basis of a useful 
method of numerical solution used by Rayleigh and Ritz and justified by Krylov. The 
usual method of solution would be to solve the partial differential equation and then 
combine solutions so as to satisfy the boundary conditions. But when the differential 
equation is equivalent to the principle that a quadratic form is stationary it is possible to 
choose a function f_0 that satisfies the boundary conditions and a set of functions f_1, f_2, ... 
that contribute nothing to the boundary values; then 

$$\phi' = f_0 + a_1 f_1 + a_2 f_2 + \dots$$

satisfies the boundary conditions. But if, as here, the correct solution makes ∭ (∂φ/∂x_i)² dτ 
a minimum we can substitute φ′ for φ, evaluate the various integrals numerically, and 
then determine a_1, a_2, ... so as to make the resulting expression a minimum. If the 
functions f_r are such that any twice differentiable function can be expressed in terms of them 
the solution is theoretically complete, and can, with sufficient effort, give accurate 
numerical solutions where the formal solution of the differential equation is too 
complicated. An extension due to Richardson and Southwell does not even require the 
functions f_r to be given explicitly. In this method a rectangular network of points is taken 
at sufficiently close intervals, and the integral to be minimized is expressed directly in 
terms of the values at these points, using centred finite differences. The values of the 
function are then solved for directly by successive approximation. The method is 
laborious, but, as Southwell remarks, will always work if the computer will. Whether 
it is more laborious than tabulating solutions of a partial differential equation depends 
on special circumstances. 
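
The network method can be illustrated by a minimal sketch in Python. It is not Southwell’s own procedure: it assumes a square net of unit spacing, takes the simplest form of successive approximation (each interior value replaced in turn by the mean of its four neighbours), and uses the hypothetical names relax_laplace and dirichlet_sum. The boundary values are those of x² − y², which satisfies the centred-difference equation exactly, so the relaxed values can be compared with it.

```python
def relax_laplace(boundary, n, sweeps=2000):
    """Successive approximation for Laplace's equation on an (n+1) x (n+1) net.

    boundary(i, j) supplies the prescribed value at a boundary node; interior
    nodes start from 0 and are repeatedly replaced by the mean of their four
    neighbours (the centred-difference form of the equation).
    """
    phi = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i in (0, n) or j in (0, n):
                phi[i][j] = boundary(i, j)
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                phi[i][j] = 0.25 * (phi[i - 1][j] + phi[i + 1][j]
                                    + phi[i][j - 1] + phi[i][j + 1])
    return phi


def dirichlet_sum(phi, n):
    """Discrete analogue of the integral of (grad phi)^2 over the square."""
    total = 0.0
    for i in range(n + 1):
        for j in range(n + 1):
            if i < n:
                total += (phi[i + 1][j] - phi[i][j]) ** 2
            if j < n:
                total += (phi[i][j + 1] - phi[i][j]) ** 2
    return total


n = 8
exact = lambda i, j: (i / n) ** 2 - (j / n) ** 2   # x^2 - y^2 is harmonic
phi = relax_laplace(exact, n)
```

Altering any interior value of phi while keeping the boundary values fixed increases dirichlet_sum, which is the discrete form of the minimal theorem of 6·072.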

6·09. Green’s equivalent stratum. Let φ satisfy Laplace’s equation within a 
closed surface S, and let R be the distance of the point Q(ξ_i) from P(x_i). If P is within S, 
take a small sphere σ about P and apply Green’s theorem to the region between σ and S. 
If P is not within S, apply the theorem to the interior of S directly. In either case 
∇²(1/R) = 0 within the region used, and we have for P within S 

$$\iint_S \left\{\frac{1}{R}\,\frac{\partial\phi}{\partial n} - \phi\,\frac{\partial}{\partial n}\frac{1}{R}\right\} dS + \iint_\sigma \left\{\frac{1}{R}\,\frac{\partial\phi}{\partial\nu} - \phi\,\frac{\partial}{\partial\nu}\frac{1}{R}\right\} d\sigma = 0, \tag{1}$$

where dν is normally outwards from the region towards P, and therefore equal to −dR. 
For P not within S the two integrals over σ do not arise, and we have simply 

$$\iint_S \left\{\frac{1}{R}\,\frac{\partial\phi}{\partial n} - \phi\,\frac{\partial}{\partial n}\frac{1}{R}\right\} dS = 0, \tag{2}$$

∂φ/∂n being the derivative approaching S from the inside. 

For P within S, and σ of radius a, which is arbitrarily small, 

$$\iint_\sigma \phi\,\frac{\partial}{\partial\nu}\frac{1}{R}\,d\sigma = \iint \phi\,d\omega \to 4\pi\phi_P, \tag{3}$$

$$\iint_\sigma \frac{1}{R}\,\frac{\partial\phi}{\partial\nu}\,d\sigma = O\left(\iint a\,d\omega\right) \to 0. \tag{4}$$

Thus 

$$4\pi\phi_P = \iint_S \left\{\frac{1}{R}\,\frac{\partial\phi}{\partial n} - \phi\,\frac{\partial}{\partial n}\frac{1}{R}\right\} dS. \tag{5}$$

This equation is a three-dimensional analogue of the theorem (11·13) in the theory of 
functions of a complex variable 

$$f(z) = \frac{1}{2\pi i}\int_C \frac{f(t)\,dt}{t - z} \tag{6}$$

when f(z) is analytic within C and z is within C. It suffices to determine φ at all points 
within a surface given φ and its normal derivative on the surface. The term in 1/R is the 
potential due to a surface density on S, that in ∂(1/R)/∂n that of a doublet shell on S. 
But these distributions cannot be assigned independently; for we see from (2) that if 
P′ is a point external to S and R′ the distance QP′, there is a relation 

$$\iint_S \left\{\frac{1}{R'}\,\frac{\partial\phi}{\partial n} - \phi\,\frac{\partial}{\partial n}\frac{1}{R'}\right\} dS = 0 \tag{7}$$

for every such position of P′. We should expect this since we know that if there is a 
solution at all, the values of either φ or ∂φ/∂n over the boundary are sufficient to determine it, 
and therefore either will determine the other, apart possibly from an additive constant. 
The complex variable analogue is that either the real or the imaginary part of a function 
f(z) over a closed contour will determine the other, apart from an additive constant, 
subject to f(z) being analytic within the contour. 

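With a suitable restriction on φ at a large distance the result can be adapted to 
determine φ outside a surface, given φ and ∂φ/∂n on the surface. For suppose that φ satisfies 
∇²φ = 0 outside S and tends to 0 at a large distance like 1/r, r being the distance from a 
fixed point within S. P is outside S; we draw a large sphere Σ enclosing S and P, and a 
small sphere σ about P as before, and apply the theorem to the region between S, σ, and Σ: 

$$\iint_{S,\,\sigma,\,\Sigma} \left\{\frac{1}{R}\,\frac{\partial\phi}{\partial\nu} - \phi\,\frac{\partial}{\partial\nu}\frac{1}{R}\right\} dS = 0, \tag{8}$$

where dν is still taken outwards from the region used and therefore, on S, is from the 
outside. On Σ, φ and 1/R are of order 1/r, and ∂φ/∂ν and ∂(1/R)/∂ν of order 1/r². Hence the 
integral over Σ is of order ∬ dΣ/r³ and tends to 0 when r → ∞. Taking dn outwards from S, 

$$4\pi\phi_P = -\iint_S \left\{\frac{1}{R}\,\frac{\partial\phi}{\partial n} - \phi\,\frac{\partial}{\partial n}\frac{1}{R}\right\} dS. \tag{9}$$

Comparing this with (2) we see that they can be consistent only if either φ or ∂φ/∂n is 
discontinuous on crossing S, and by subtraction 

$$4\pi\phi_P = \iint_S \left[(\phi_0 - \phi_1)\,\frac{\partial}{\partial n}\frac{1}{R} - \frac{1}{R}\left\{\left(\frac{\partial\phi}{\partial n}\right)_0 - \left(\frac{\partial\phi}{\partial n}\right)_1\right\}\right] dS, \tag{10}$$

suffixes 0 and 1 indicating limits on approaching S from the outside and inside 
respectively. If φ is continuous across S, 

$$4\pi\phi_P = -\iint_S \frac{1}{R}\left\{\left(\frac{\partial\phi}{\partial n}\right)_0 - \left(\frac{\partial\phi}{\partial n}\right)_1\right\} dS. \tag{11}$$

If ∂φ/∂n is continuous, 

$$4\pi\phi_P = \iint_S (\phi_0 - \phi_1)\,\frac{\partial}{\partial n}\frac{1}{R}\,dS. \tag{12}$$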

The field can therefore be represented in terms of a distribution either of charge or of 
doublets over S. Consequently all theorems derived from the integral definition of a 
potential are true given only that the function satisfies Laplace’s equation in the region. 

Unfortunately, direct application of these theorems to find the internal or external 
field is seldom possible. The integrand in each of the equations, given either φ or ∂φ/∂n 
over S, involves the other, and to find the latter usually involves the complete solution 
of the problem, and the potential at P will be found in the process. It can, however, be 
carried out in the important special cases of a sphere, a circle, and a plane. The case of 
the circle is treated in Chapter 14. 


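6·091. Solution for a sphere: external point. Let φ be given over a sphere of 
radius a, P an external point at distance r, and P′ its inverse point in the sphere. Let Q 
be any point on the surface and take PQ = R, P′Q = R′. Then 

$$4\pi\phi_P = -\iint \left\{\frac{1}{R}\,\frac{\partial\phi}{\partial n} - \phi\,\frac{\partial}{\partial n}\frac{1}{R}\right\} dS \tag{1}$$

and 

$$0 = \iint \left\{\frac{1}{R'}\,\frac{\partial\phi}{\partial n} - \phi\,\frac{\partial}{\partial n}\frac{1}{R'}\right\} dS \tag{2}$$

because P′ is an internal point and the value taken 
for ∂φ/∂n at Q is the external value. But on the sphere 

$$\frac{R'}{R} = \frac{a}{r}. \tag{3}$$

Hence we can eliminate (∂φ/∂n)_Q; we get 

$$4\pi\phi_P = \iint \phi_Q \left\{\frac{\partial}{\partial n}\frac{1}{R} - \frac{a}{r}\,\frac{\partial}{\partial n}\frac{1}{R'}\right\} dS. \tag{4}$$

Denote the angle POQ by θ. Then 

$$R^2 = a^2 + r^2 - 2ar\cos\theta, \qquad \frac{\partial}{\partial n}\frac{1}{R} = \frac{\partial}{\partial a}\frac{1}{R} = -\frac{a - r\cos\theta}{R^3}, \tag{5}$$

and similarly (keeping P′ fixed in the differentiation) 

$$\frac{\partial}{\partial n}\frac{1}{R'} = -\frac{a - r'\cos\theta}{R'^3}, \qquad r' = OP' = \frac{a^2}{r}. \tag{6}$$

Hence, on substituting and simplifying, 

$$4\pi\phi_P = \iint \phi_Q\,\frac{r^2 - a^2}{aR^3}\,dS. \tag{7}$$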


6·092. Internal point. Similarly, we find for the potential at the internal point P′, 
where OP′ = r′, 

$$4\pi\phi_{P'} = \iint \phi_Q\,\frac{a^2 - r'^2}{aR'^3}\,dS. \tag{8}$$

This problem, solved by Green, is sometimes called the first boundary potential 
problem for a sphere. The second problem is the determination of the field given the normal 
gradient over a sphere. It can be solved by noticing that if φ is a potential function, 
r ∂φ/∂r is another, and the information supplied gives its surface value. The third problem 
is the one that occurs in the study of gravity, where the level surfaces are approximately 
spherical but ∂φ/∂n is observed, not on a sphere, but over a surface where φ is constant. 
It is found that, to the first order of small departures from spherical symmetry, this 
information is equivalent to the values of ∂φ/∂r + 2φ/r over the sphere of equal volume. But 
(1/r) ∂(r²φ)/∂r is a potential function and the information determines it over the sphere. Its 
values outside the sphere are therefore determined and those of φ are found by an 
integration under the integral sign with regard to r. The problem was solved by this method by 
Idelson and Malkin; the original solution was by Stokes, using spherical harmonics. Cf. 
24·114. 

In (8) we notice that if P′ is at O, r′ = 0, R′ = a for all Q, and 

$$4\pi\phi_O = \frac{1}{a^2}\iint \phi\,dS.$$

Thus the mean value of a potential function over a sphere is equal to its value at the centre. 
It follows that a function cannot have a maximum or a minimum at any internal point 
of a region where it satisfies Laplace’s equation. The extreme values must be taken on 
the boundary. 
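
The formula for an internal point and the mean value property can be checked numerically. The following is a minimal sketch in Python, assuming a simple midpoint rule in cos θ and in longitude; the function name poisson_interior and the test function xy + z + 1.5 (which satisfies Laplace’s equation) are illustrative choices, not part of the text.

```python
import math


def poisson_interior(phi, a, p, n_mu=200, n_lam=400):
    """Approximate phi at an internal point p from its values on the sphere r = a.

    Evaluates (1/4pi) * integral of phi_Q (a^2 - r'^2) / (a R'^3) dS by the
    midpoint rule, with dS = a^2 d(mu) d(lambda) and mu = cos(theta).
    """
    r2 = sum(c * c for c in p)
    d_mu = 2.0 / n_mu
    d_lam = 2.0 * math.pi / n_lam
    total = 0.0
    for i in range(n_mu):
        mu = -1.0 + (i + 0.5) * d_mu
        s = math.sqrt(1.0 - mu * mu)
        for j in range(n_lam):
            lam = (j + 0.5) * d_lam
            q = (a * s * math.cos(lam), a * s * math.sin(lam), a * mu)
            big_r = math.dist(q, p)          # R' = distance QP'
            total += phi(*q) * (a * a - r2) / (a * big_r ** 3)
    return total * a * a * d_mu * d_lam / (4.0 * math.pi)


def harmonic(x, y, z):
    # x*y + z + 1.5 has zero Laplacian
    return x * y + z + 1.5
```

With a = 2 the value recovered at (0.3, 0.2, −0.1) should be close to 1.46, and at the centre the integral reduces to the mean of the surface values, which should be close to 1.5.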

6·093. Solution for a plane. If we make the radius of the sphere very large we 
approach in the limit the solution of the corresponding problem to Green’s for a plane. 
If z is the normal distance of P from the plane, 

$$2\pi\phi_P = \iint \frac{\phi_Q}{R^3}\,z\,dS. \tag{1}$$

It can be verified that the solutions obtained satisfy ∇²φ = 0 and tend to the proper value 
on the boundary; the method* is similar to that used in 6·091 and 6·092. Also if 

$$2\pi\phi_P = -\iint \frac{1}{R}\,\frac{\partial\phi}{\partial z}\,dS, \tag{2}$$

the solution has a derivative tending to the proper value on the boundary. The last result 
is of course obvious, since the surface density corresponding to a derivative (∂φ/∂z) over 
a plane is −(1/2πγ) ∂φ/∂z. 

* Given in full by Poincaré, Théorie du Potentiel Newtonien, 1899, pp. 183-91. 

6·10. Potential and field in a polarized medium. Consider a finite region where 
there is no total electric charge, but the polarization at a point Q is P; that is, if dτ is a 
small element of volume about Q(ξ_i) the dipole moment of the element is P dτ. Consider 
first the potential and field at P(x_i), a point outside the region. Then 

$$\phi = \gamma\iiint P_k\,\frac{\partial}{\partial\xi_k}\frac{1}{R}\,d\tau, \tag{1}$$

$$\frac{\partial\phi}{\partial x_i} = \gamma\iiint P_k\,\frac{\partial^2}{\partial x_i\,\partial\xi_k}\frac{1}{R}\,d\tau. \tag{2}$$

Since 

$$\iiint P_k\,\frac{\partial}{\partial\xi_k}\frac{1}{R}\,d\tau = \iint \frac{l_k P_k}{R}\,dS - \iiint \frac{1}{R}\,\frac{\partial P_k}{\partial\xi_k}\,d\tau, \tag{3}$$

φ is the same as the potential due to a distribution of charge of density −div P through 
the region and one of surface density l_k P_k, the normal component of P, over the boundary 
of the region. 

As for a continuous distribution of charge, when we wish to find the potential and field 
at a point P inside the region we must first suppose a small cavity made about P and study 
the behaviour of the integrals when the size of the cavity tends to zero.* But now the 
improper integrals in (2) depend on the shape of the cavity. Those in (3) do not. We denote 
the outer boundary by S, that of the cavity by Σ, the region inside S by V and that inside 
Σ by v, and therefore the region between them by V − v. Then the potential φ is given by 

$$\phi = \gamma\iint_S \frac{l_i P_i}{R}\,dS + \lim_{v\to 0}\gamma\iint_\Sigma \frac{l_i P_i}{R}\,d\Sigma - \lim_{v\to 0}\gamma\iiint_{V-v} \frac{1}{R}\,\frac{\partial P_i}{\partial\xi_i}\,d\tau. \tag{4}$$

Provided P_i is bounded the second integral tends to zero, and 

$$\phi = \gamma\iint_S \frac{l_i P_i}{R}\,dS - \gamma\iiint_V \frac{1}{R}\,\frac{\partial P_i}{\partial\xi_i}\,d\tau, \tag{5}$$

which is the same as the potential due to a surface distribution l_i P_i over S and a volume 
distribution −div P through V. Then if div P satisfies the conditions imposed on ρ in 
6·04 we know by the considerations there that we may differentiate (5) under the integral 
sign; then 

$$-\frac{\partial\phi}{\partial x_i} = -\gamma\iint_S l_k P_k\,\frac{\partial}{\partial x_i}\frac{1}{R}\,dS + \gamma\iiint_V \frac{\partial P_k}{\partial\xi_k}\,\frac{\partial}{\partial x_i}\frac{1}{R}\,d\tau. \tag{6}$$

We denote this by E_i and call it the electric intensity; it also is independent of the limiting 
shape of the cavity. 

Again, E_i is the field due to the surface distribution l_i P_i over S and the volume 
distribution −div P through V. As we shall see, it is not in general the same as the limit of 
the field in the cavity. It follows from Poisson’s equation that 

$$\operatorname{div}\mathbf{E} = -\nabla^2\phi = -4\pi\gamma\,\operatorname{div}\mathbf{P}, \tag{7}$$

or 

$$\operatorname{div}(\mathbf{E} + 4\pi\gamma\mathbf{P}) = 0. \tag{8}$$

If in addition to the distribution of dipoles there was a charge density ρ there would 
be additional terms in φ and E, and (8) would be replaced by 

$$\operatorname{div}(\mathbf{E} + 4\pi\gamma\mathbf{P}) = 4\pi\gamma\rho. \tag{9}$$

This is physically possible. We could make a hole in a block of paraffin wax, create a 
charge on its interior by friction, and then fill up the hole again, thus leaving a charge 
density in the interior of a solid. The vector E + 4πγP is called the displacement or 
induction and denoted by D. 

* P outside the cavity remaining unaltered. 

Now consider the field inside v. Since P is an external point of the region V − v the field is 

$$F_i = -\frac{\partial}{\partial x_i}\,\gamma\iiint_{V-v} P_k\,\frac{\partial}{\partial\xi_k}\frac{1}{R}\,d\tau, \tag{10}$$

which is equal, if we make the cavity tend to zero, to 

$$E_i + \lim \gamma\iint_\Sigma l_k P_k\,\frac{x_i - \xi_i}{R^3}\,d\Sigma. \tag{11}$$

If we take the cavity to be a circular cylinder of length 2a and radius b with its centre 
at Q and its axis in the direction of P, then l_k P_k at A is P and at B is −P (the normal 
being into the cavity) and is zero over the sides, apart from small effects due to variation 
of P in the neighbourhood. Then the field at Q due to the surface distribution over the 
walls of the cavity is 

$$4\pi\gamma P\left\{1 - \frac{a}{(a^2+b^2)^{1/2}}\right\}. \tag{12}$$

Hence if as a and b become small, b/a → 0 this contribution to the 
field is zero. If a/b → 0 the contribution is 4πγP. Hence E as defined 
by (5) is the limit of the field in a needle-shaped cavity with its axis 
in the direction of P, and D (= E + 4πγP) is the limit of the field in 
a coin-shaped cavity with its axis in the direction of P. 

For a spherical cavity the surface density on the wall is −P cos θ and hence the field 
at the centre due to it is (4/3)πγP. 
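
The three limiting cases can be compared numerically. The sketch below, in Python, is an illustration only: the function names are hypothetical, the results are expressed in units of γP, and the spherical case is obtained by a midpoint rule applied to the integral of cos²θ over the solid angle.

```python
import math


def cylinder_wall_factor(a, b):
    """Field at the centre due to the end walls of a cylindrical cavity of
    half-length a and radius b (axis along P), in units of gamma * P."""
    return 4.0 * math.pi * (1.0 - a / math.hypot(a, b))


def sphere_wall_factor(n=1000):
    """Same quantity for a spherical cavity: the surface density -P cos(theta)
    contributes cos^2(theta) d(omega) along P; integrate by the midpoint rule
    in mu = cos(theta)."""
    d_mu = 2.0 / n
    return 2.0 * math.pi * d_mu * sum((-1.0 + (k + 0.5) * d_mu) ** 2 for k in range(n))


needle = cylinder_wall_factor(1.0, 1e-6)   # b/a -> 0
coin = cylinder_wall_factor(1e-6, 1.0)     # a/b -> 0
sphere = sphere_wall_factor()
```

The needle gives nearly zero, the coin nearly 4π, and the sphere the intermediate value 4π/3.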

Suppose now that there is a relation 

$$\mathbf{P} = k\mathbf{E}, \tag{13}$$

so that 

$$\mathbf{D} = \mathbf{E}(1 + 4\pi\gamma k) = K\mathbf{E}, \tag{14}$$

where k, and similarly K, are continuous scalar functions of position. Then k is called the 
electric susceptibility and K the dielectric constant of the material, which is isotropic. 
If the relation between D and E has the form 

$$D_i = K_{ik}E_k, \tag{15}$$

where K_ik is a tensor that is not a multiple of δ_ik, the medium is anisotropic. 

We consider now the behaviour of D and E at the boundary of two uniform media of 
different dielectric constants K_1, K_2. The intensity E at any point is that due to a volume 
distribution −div P and a surface distribution P_1n − P_2n, where n denotes the component in 
the direction of the normal drawn from the medium 1 to the medium 2. Owing to the 
surface distribution there is a discontinuity in the normal component of E, namely, 

$$E_{2n} - E_{1n} = 4\pi\gamma(P_{1n} - P_{2n}).$$

Hence the normal component of D has the same value on both sides of the boundary. 

The corresponding theory for magnetism is similar. The differences are (1) ρ is always 0, 
(2) k and K (now called the permeability μ) may vary greatly with the magnetic intensity 
H, (3) there are permanent magnets, with fixed intensity of magnetization, so that we 
can write B = H + 4πγI, but there is no relation between I and H. 

6·11. Vector potential. Let u be a solenoidal vector. Then we can show that in general 
there is another vector A such that u is its curl. It is convenient first to call the 
components u, v, w and A, B, C; then we have to satisfy 

$$u = \frac{\partial C}{\partial y} - \frac{\partial B}{\partial z}, \qquad v = \frac{\partial A}{\partial z} - \frac{\partial C}{\partial x}, \qquad w = \frac{\partial B}{\partial x} - \frac{\partial A}{\partial y}. \tag{1}$$

Take 

$$C_1 = 0, \qquad A_1 = \int_{z_0}^{z} v\,dz, \qquad B_1 = -\int_{z_0}^{z} u\,dz, \tag{2}$$

where z_0 is independent of x, y and z, and the path of integration is parallel to the axis of z. 
Then the first two equations are satisfied. But 

$$\frac{\partial B_1}{\partial x} - \frac{\partial A_1}{\partial y} = -\int_{z_0}^{z}\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) dz = \int_{z_0}^{z}\frac{\partial w}{\partial z}\,dz = w - w(x, y, z_0), \tag{3}$$

since 

$$\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0. \tag{4}$$

In general w(x, y, z_0) will not be zero, but it depends on x, y only and can be denoted by w_0. 
Now put 

$$A = A_1 + A_2, \text{ etc.}; \tag{5}$$

then 

$$\frac{\partial C_2}{\partial y} - \frac{\partial B_2}{\partial z} = 0, \qquad \frac{\partial A_2}{\partial z} - \frac{\partial C_2}{\partial x} = 0, \qquad \frac{\partial B_2}{\partial x} - \frac{\partial A_2}{\partial y} = w_0. \tag{6}$$

Take 

$$B_2 = 0, \qquad C_2 = 0, \qquad A_2 = -\int_{y_0}^{y} w_0\,dy. \tag{7}$$

Then all the equations are satisfied. There is still a considerable arbitrariness in the 
solution; for we could add to (A, B, C) the gradient of any scalar whatever without 
affecting its curl. Conversely we can impose an additional condition on (A, B, C) to make 
its divergence anything we like, in particular zero. For we need only add the gradient of 
a scalar Φ such that ∇²Φ cancels the divergence of the solution we have already found; 
and given suitable boundary conditions Φ will be determinate. Hence, translating into 
tensor notation, if 

$$\frac{\partial u_i}{\partial x_i} = 0, \tag{8}$$

a vector A_i exists satisfying 

$$u_i = \epsilon_{ikm}\,\frac{\partial A_m}{\partial x_k}, \qquad \frac{\partial A_i}{\partial x_i} = 0, \tag{9}$$

and A_i is unique subject to the same sort of boundary conditions as ensure uniqueness of 
solution in a potential problem. A is called the vector potential. 
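
The two-step construction can be followed numerically. The sketch below, in Python, is a minimal illustration under stated assumptions: the integrals are evaluated by a midpoint rule, the curl is taken by centred differences, and the names vector_potential and curl, together with the example field (y, z, x), are hypothetical choices made here. The example is chosen so that w_0 = w(x, y, z_0) does not vanish and the second step is needed.

```python
def vector_potential(u, v, w, z0=0.0, y0=0.0, n=200):
    """Return A(x, y, z) built by the two-step construction, so that curl A = (u, v, w).

    u, v, w are callables of (x, y, z) with zero divergence; z0 and y0 are
    the arbitrary lower limits of integration.
    """
    def integral(f, lo, hi):
        # midpoint rule; valid for hi < lo as well, since h is then negative
        h = (hi - lo) / n
        return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

    def A(x, y, z):
        a1 = integral(lambda t: v(x, y, t), z0, z)     # A_1
        b1 = -integral(lambda t: u(x, y, t), z0, z)    # B_1
        a2 = -integral(lambda t: w(x, t, z0), y0, y)   # A_2, from w_0 = w(x, y, z0)
        return (a1 + a2, b1, 0.0)                      # C_1 = B_2 = C_2 = 0
    return A


def curl(F, x, y, z, h=1e-4):
    """Centred-difference curl of a vector-valued function F(x, y, z)."""
    def d(i, axis):
        p, m = [x, y, z], [x, y, z]
        p[axis] += h
        m[axis] -= h
        return (F(*p)[i] - F(*m)[i]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


# (u, v, w) = (y, z, x) has zero divergence, and w_0 = x is not zero.
A = vector_potential(lambda x, y, z: y, lambda x, y, z: z, lambda x, y, z: x)
```

For this field the construction gives A = (½z² − xy, −yz, 0), and curl(A, x, y, z) should reproduce (y, z, x) at any point.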

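6·111. The above method is very unsymmetrical in the coordinates. A symmetrical 
solution can be found in some cases as follows, and is useful when curl u is given 
everywhere. Let 

$$\epsilon_{ikm}\,\frac{\partial u_m}{\partial x_k} = \omega_i, \qquad \frac{\partial u_i}{\partial x_i} = 0, \tag{10}$$

and assume 

$$u_i = \epsilon_{ips}\,\frac{\partial A_s}{\partial x_p}, \qquad \frac{\partial A_i}{\partial x_i} = 0. \tag{11}$$

Then 

$$\epsilon_{ikm}\,\frac{\partial u_m}{\partial x_k} = \epsilon_{ikm}\epsilon_{mps}\,\frac{\partial^2 A_s}{\partial x_k\,\partial x_p} = (\delta_{ip}\delta_{ks} - \delta_{is}\delta_{kp})\,\frac{\partial^2 A_s}{\partial x_k\,\partial x_p} = \frac{\partial^2 A_s}{\partial x_i\,\partial x_s} - \frac{\partial^2 A_i}{\partial x_k^2} = -\nabla^2 A_i. \tag{12}$$

Therefore the conditions are satisfied if 

$$A_i = \frac{1}{4\pi}\iiint \frac{\omega_i}{R}\,d\tau \tag{13}$$

through all space and if A_i has zero divergence. But applying Green’s theorem to the 
region between a small sphere about x_i and a large sphere 

$$4\pi\,\frac{\partial A_i}{\partial x_i} = -\iiint \omega_i\,\frac{\partial}{\partial\xi_i}\frac{1}{R}\,d\tau = -\iint \frac{l_i\omega_i}{R}\,dS + \iiint \frac{1}{R}\,\frac{\partial\omega_i}{\partial\xi_i}\,d\tau. \tag{14}$$

The second integral vanishes since ω_i is the curl of a vector. The first taken over the inner 
sphere tends to 0 if ω_i is bounded near ξ_i = x_i, and over the larger sphere also tends to 0 
if ω_i tends to 0 suitably at large distances. Hence (13) gives a solution of the problem. It 
is still not unique because to A_i we could add the gradient of any solution of Laplace’s 
equation without affecting the relations (11). 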

This method fails if u_i is irrotational. For then ω_i is 0 everywhere and (13) vanishes. 
Its chief use is when ω_i = 0 outside a filament of very small cross-section and is large 
inside it. This case arises in vortex motion and in the magnetic field due to an electric 
current. In either case the integral 

$$\Omega = \oint u_i\,dx_i, \tag{15}$$

taken around a closed circuit, is zero if the circuit can be filled up by a cap not cutting the 
filament or wire, and has the same value for all circuits that can be filled up by caps 
that cut through it once. If we consider the contribution to A_i from an element between 
two planes separated by dξ_i, and call the element of surface in a plane parallel to them dS, 
we have 

$$d\tau = d\xi\,dS, \tag{16}$$

$$\iint \omega_i l_i\,dS = \oint u_i\,d\xi_i = \Omega \tag{17}$$

by Stokes’s theorem; hence 

$$A_i = \frac{\Omega}{4\pi}\int \frac{d\xi_i}{R}, \tag{18}$$

taken along the length of the filament. Also 

$$u_i = \epsilon_{ikm}\,\frac{\partial A_m}{\partial x_k} = \frac{\Omega}{4\pi}\int \epsilon_{ikm}\,\frac{(\xi_k - x_k)\,l_m}{R^3}\,ds, \qquad \mathbf{u} = \frac{\Omega}{4\pi}\int \frac{(\boldsymbol{\xi} - \mathbf{x})\wedge d\boldsymbol{\xi}}{R^3}, \tag{19}$$

where ds is an element of length of the filament and l_m a direction cosine of the tangent. 
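
The line integral for the field of a filament is easily evaluated numerically. The following Python sketch is an illustration under stated assumptions: the filament is replaced by a closed polygon of n chords, the strength is written omega, and the name filament_field and the circular test path are hypothetical choices. For a circle of radius a the integral gives the known value Ω a²/{2(a² + z²)^(3/2)} on the axis at height z, which provides a check.

```python
import math


def filament_field(omega, path, x, n=2000):
    """Field at x of a closed filament of strength omega.

    path(t), 0 <= t <= 1, returns a point of the filament; the integral of
    (xi - x) ^ d(xi) / R^3 is approximated chord by chord at the chord midpoints.
    """
    total = [0.0, 0.0, 0.0]
    for k in range(n):
        p0 = path(k / n)
        p1 = path((k + 1) / n)
        d = [p1[i] - p0[i] for i in range(3)]                   # d(xi)
        s = [0.5 * (p0[i] + p1[i]) - x[i] for i in range(3)]    # xi - x
        r3 = math.sqrt(sum(c * c for c in s)) ** 3
        cross = (s[1] * d[2] - s[2] * d[1],
                 s[2] * d[0] - s[0] * d[2],
                 s[0] * d[1] - s[1] * d[0])
        for i in range(3):
            total[i] += cross[i] / r3
    return tuple(omega / (4.0 * math.pi) * c for c in total)


def circle(t, a=2.0):
    # circle of radius a in the plane z = 0, described counter-clockwise
    return (a * math.cos(2.0 * math.pi * t), a * math.sin(2.0 * math.pi * t), 0.0)


u = filament_field(3.0, circle, (0.0, 0.0, 1.5))
```

Here Ω = 3, a = 2 and z = 1.5, so the axial component should be close to 12/31.25 = 0.384, with the other two components vanishing by symmetry.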


In hydrodynamics Ω is the circulation about the filament. In electromagnetism it is 
4πJ/c, where J is the current in electrostatic units, and u_i is the magnetic field. 

6·112. Vector potentials of point charge and doublet. Take a doublet along the 
axis of z, so that 

$$\mathbf{u} = \mu\,\operatorname{grad}\,(z/r^3).$$

Then in 6·11 (2) we take the lower limit for z to be −∞; and 

$$A_1 = \frac{\mu y}{r^3}, \qquad B_1 = -\frac{\mu x}{r^3}, \qquad C_1 = 0,$$

and from 6·11 (7) A_2 = 0, since w tends to 0 when z → −∞. Then a solution is 
μ(y/r³, −x/r³, 0), and the divergence of this is zero. Therefore, if u_i is the gradient of a 
potential μ_k x_k/r³, it is also the curl of a vector potential 

$$A_i = -\epsilon_{ikm}\,\frac{\mu_k x_m}{r^3}.$$

A similar treatment applied to the elementary vector grad (1/r) shows that it is the 
curl of the vector 

$$\left(-\frac{y}{\varpi^2},\ \frac{x}{\varpi^2},\ 0\right)\left(1 + \frac{z}{r}\right).$$

The part (−y/ϖ², x/ϖ², 0) is the gradient of the scalar tan⁻¹(y/x) and is irrelevant; so we 
can take 

$$\mathbf{A} = \left(-\frac{yz}{r\varpi^2},\ \frac{xz}{r\varpi^2},\ 0\right).$$

This is still very unsymmetrical; partial symmetry can be given to it by taking its mean 
with the two other vectors obtained by cyclic interchange of the coordinates, but at the 
cost of producing lines of singularity along three coordinate axes instead of one. 
Generalization by rotation of axes merely increases the complexity. 
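
Both vector potentials can be verified by differentiating them numerically. The Python sketch below is a check under stated assumptions, not part of the original argument: the doublet potential is taken as μ(y/r³, −x/r³, 0) and the point-charge potential as (−yz/(rϖ²), xz/(rϖ²), 0), with ϖ² = x² + y²; the curl is formed by centred differences and compared with μ grad (z/r³) and grad (1/r) respectively. All function names are hypothetical.

```python
import math


def curl(F, x, y, z, h=1e-5):
    """Centred-difference curl of a vector-valued function F(x, y, z)."""
    def d(i, axis):
        p, m = [x, y, z], [x, y, z]
        p[axis] += h
        m[axis] -= h
        return (F(*p)[i] - F(*m)[i]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


def doublet_potential(x, y, z, mu=1.0):
    r3 = (x * x + y * y + z * z) ** 1.5
    return (mu * y / r3, -mu * x / r3, 0.0)


def doublet_field(x, y, z, mu=1.0):
    # mu * grad(z / r^3)
    r2 = x * x + y * y + z * z
    r5 = r2 ** 2.5
    return (-3.0 * mu * x * z / r5, -3.0 * mu * y * z / r5, mu * (r2 - 3.0 * z * z) / r5)


def charge_potential(x, y, z):
    w2 = x * x + y * y                    # varpi^2
    r = math.sqrt(w2 + z * z)
    return (-y * z / (r * w2), x * z / (r * w2), 0.0)


def charge_field(x, y, z):
    # grad(1 / r)
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-x / r3, -y / r3, -z / r3)


point = (0.7, -0.4, 1.3)
```

At a point off the z axis, such as the one above, curl(doublet_potential, *point) should agree with doublet_field(*point), and likewise for the point charge.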


EXAMPLES 

1. If the attraction between two elements of mass m, m′ is mm′f(R), where R is the distance 
between them, show that the inward acceleration due to a uniform thin spherical shell of mass M and 
internal radius a, at an internal point distant x (x < a) from the centre, is F, where 

$$F = \frac{M}{4ax^2}\int_{a-x}^{a+x} (R^2 + x^2 - a^2)\,f(R)\,dR.$$

If F is zero for all values of a and x, subject to the condition x < a, show that the only possible form of 
f(R) is A/R², where A is a constant. 

2. Assuming that at any point the magnetic potential of a body magnetized to intensity I is the 
same as that of a volume distribution of magnetic poles of density −div I together with a surface 
distribution n_i I_i, where n_i are the direction cosines of the outward normal, prove that (i) the field 
inside a sphere permanently magnetized to intensity I is of uniform intensity −(4/3)πI, (ii) the field 
vanishes in a spherical cavity (not necessarily concentric) made in this sphere, (iii) if a plane lamina of 
uniform thickness and infinite extent is permanently magnetized to uniform intensity I the field in 
a spherical hole made in the lamina is of intensity (4/3)πI − 4π(I.n)n, where n is normal to one of the 
plane faces of the lamina. (M.T. 1939.) 

3. O is the centre of a uniform cube of mass M, and P is an external point. The distance OP is 
large compared with the edge a of the cube, and the direction cosines of OP referred to three 
concurrent edges of the cube are l, m, n. Show how to express the gravitation potential at P in a power 
series in OP⁻¹, and evaluate it up to the term in OP⁻⁵. (Prelim. 1936.) 

4. If $\phi_1$ and $\phi_2$ satisfy the conditions (1) $\phi$ is continuous everywhere, and has continuous second 
derivatives except on certain surfaces, (2) $K\,\partial\phi/\partial n$ has a given discontinuity on crossing such surfaces, 
(3) $\phi$ tends to 0 like $1/r$, and $\partial\phi/\partial x_i$ like $1/r^2$ when $r$ is large; and if further $\operatorname{div}\{K \operatorname{grad}(\phi_1 - \phi_2)\} = 0$ 
except on the surfaces of discontinuity, show that 

taken through all space; and hence that, subject to the conditions given, $\phi$ is uniquely determined 
provided $K$ is everywhere positive. 

5. Prove that if $\phi$ satisfies conditions (1) and (3) above, the conditions that 

shall be stationary for all small variations of $\phi$ are that 

except on the special surfaces, and that $K\,\partial\phi/\partial n$ shall be continuous on crossing such surfaces; and that 
if $K$ is everywhere positive the stationary value of $V$ is a minimum. Show also that if $K\,\partial\phi/\partial n$ has a 
prescribed discontinuity in crossing any such surface, $V$ is still stationary provided $\phi$ is not varied on 
that surface. 

If $\phi$ is the electrostatic potential in a field containing dielectrics, give the physical interpretations 
of the postulates and conclusions. 


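6. Show that if $\mathbf{a}$ is a constant vector 

$$\operatorname{curl}\left(\frac{\mathbf{a}\wedge\mathbf{r}}{r^n}\right) = \frac{(2-n)\,\mathbf{a}}{r^n} + \frac{n\,\mathbf{r}\,(\mathbf{a}\cdot\mathbf{r})}{r^{n+2}},$$

and hence or otherwise find an expression for the vector potential inside an infinite straight solenoid. 

(M/c, Part III, 1936.) 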

7. $\nabla^2 u_1 = 0$, $\nabla^2 u_2 = 0$ in a closed region $D$, and $u_1 = u_2$ in a region $D_1$, which is part of $D$. Prove 
that $u_1 = u_2$ in the whole of $D$. 

8. Deduce from Stokes's theorem that 

$$\int \phi\, d\mathbf{s} = \iint d\mathbf{S}\wedge\operatorname{grad}\phi$$

by considering the projection of $\int \phi\, d\mathbf{s}$ on a line of fixed direction. 

Deduce that the mutual potential energy of two uniform magnetic shells of strengths $\mu$, $\mu'$ is 

$$-\mu\mu'\iint \frac{d\mathbf{s}\cdot d\mathbf{s}'}{R}.$$







Chapter 7 

OPERATIONAL METHODS 

Even Cambridge mathematicians deserve justice. 


OLIVER HEAVISIDE 

7·01. Rules of arithmetic for differential operator. In a certain sense the operations 
of differentiation and definite integration satisfy the rules of arithmetic. For if $a$ 
and $b$ are constants 

$$\frac{d}{dx}f(x) + a f(x) = a f(x) + \frac{d}{dx}f(x), \tag{1}$$

$$\left\{\frac{d}{dx}f(x) + a f(x)\right\} + b f(x) = \frac{d}{dx}f(x) + \{a f(x) + b f(x)\}, \tag{2}$$

$$\frac{d}{dx}\{a f(x)\} = a\,\frac{d}{dx}f(x), \tag{3}$$

$$a\{b f(x)\} = (ab) f(x), \tag{4}$$

and if we define 

$$\left(\frac{d}{dx} + a\right) f(x) = \frac{d}{dx}f(x) + a f(x) = \left(a + \frac{d}{dx}\right) f(x), \tag{5}$$

which is permissible by (1), we have the forms of the distributive law 

$$\frac{d}{dx}\{(a + b) f(x)\} = \frac{d}{dx}\{a f(x)\} + \frac{d}{dx}\{b f(x)\}, \tag{6}$$

$$b\left(\frac{d}{dx} + a\right) f(x) = b\,\frac{d}{dx}f(x) + ab f(x). \tag{7}$$

Thus in any algebraic combination with constants, involving only addition (and therefore 
subtraction) and multiplication, the differential operator can be manipulated as if it were 
itself a numerical constant. The function $f(x)$ operated on remains on the right and the 
operation can be carried out on it at the end. 

7·011. Operation of definite integration. The same applies to the operation of 
definite integration. If we write $Qf(t)$ for $\int_0^t f(\tau)\,d\tau$, 

$$Qf(t) + a f(t) = a f(t) + Qf(t), \tag{1}$$

and therefore both can be denoted by 

$$(Q + a) f(t) = (a + Q) f(t); \tag{2}$$

$$\{Qf(t) + a f(t)\} + b f(t) = Qf(t) + \{a f(t) + b f(t)\}; \tag{3}$$

also 

$$Q\{a f(t)\} = a\{Qf(t)\}, \tag{4}$$

$$a\{b f(t)\} = (ab)\{f(t)\}, \tag{5}$$

$$Q\{(a + b) f(t)\} = Q\{a f(t)\} + Q\{b f(t)\}, \tag{6}$$

$$a(Q + b) f(t) = aQf(t) + ab f(t). \tag{7}$$

Analogous relations can be found by replacing $a$ or $b$ by $Q$ and interpreting 

$$Q^2 f(t) = Q\{Qf(t)\} = \int_0^t Qf(\tau)\,d\tau,$$

where $Qf(\tau)$ is the same function of $\tau$ as $Qf(t)$ is of $t$. 

7·012. Non-commutative property of differentiation and definite integration. 

The above properties of differentiation, in the hands of Boole, became the basis of a 
well-known method of obtaining particular integrals of linear differential equations of certain 
types. It is also the basis of a method of obtaining the formulae required in numerical 
interpolation, differentiation, and integration. The corresponding property of definite 
integration was apparently noticed first by J. Caqué,* and led to a number of developments 
in the theory of the solution of linear differential equations in the hands of Fuchs, Peano, 
Picard, and H. F. Baker. Heaviside attended particularly to linear differential equations 
with constant coefficients, which enable the maximum use to be made of the fact that the 
operator of integration can be combined with constants exactly as if it was a number. 
Of course it does not commute with variables; $Q\{tf(t)\}$ is not the same as $t\{Qf(t)\}$. But the 
reason why his methods worked can really be traced back to the method used by Picard 
in proving, for assigned conditions, the existence of solutions of differential equations. 
Unfortunately, though Heaviside noticed that the operators of differentiation and 
integration combine with constants without restriction, he did not notice that they do not 
commute with each other. In fact 

$$\frac{d}{dt}Qf(t) = f(t), \qquad Q\frac{d}{dt}f(t) = \int_0^t f'(\tau)\,d\tau = f(t) - f(0),$$

which differ by $f(0)$. Heaviside obtained a considerable number of wrong results through 
interchanging the order of differentiation and integration, and their explanation in terms 
of this non-commutative property was first given by H. Jeffreys.† Heaviside was also 
not particularly interested in questions of convergence, and this fact so disturbed the pure 
mathematicians of the time that they omitted to find out in what conditions the methods 
could be justified, as in fact they can within their proper scope. For dynamical systems 
with a finite number of degrees of freedom the systematic use of the definite integration 
operator provides far more concise solutions than any other method gives, and its 
justification is complete for such problems without needing any pure mathematics more 
advanced than was available in Heaviside's time. 
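
The failure of differentiation and definite integration to commute can be checked on any 
polynomial. The following is a minimal sketch in Python, not part of the original text; the 
helper names `differentiate`, `integrate_from_zero` and `trim` are hypothetical, and polynomials 
are assumed to be stored as lists of coefficients in ascending powers of $t$. 

```python
from fractions import Fraction

def differentiate(poly):
    """d/dt of a polynomial given as coefficients in ascending powers of t."""
    return [k * c for k, c in enumerate(poly)][1:] or [Fraction(0)]

def integrate_from_zero(poly):
    """The operator Q: integrate from 0 to t, so the result vanishes at t = 0."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(poly)]

def trim(poly):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    poly = list(poly)
    while len(poly) > 1 and poly[-1] == 0:
        poly.pop()
    return poly

# f(t) = 5 - 2t + 3t^2, so that f(0) = 5
f = [Fraction(5), Fraction(-2), Fraction(3)]

dQf = trim(differentiate(integrate_from_zero(f)))   # (d/dt) Q f, which returns f
Qdf = trim(integrate_from_zero(differentiate(f)))   # Q (d/dt) f, which loses f(0)

# The two results differ by the constant f(0)
difference = trim([a - b for a, b in zip(dQf, Qdf)])
```

Here `dQf` reproduces $f$ exactly, while `Qdf` is $-2t + 3t^2$; the difference is the constant $f(0) = 5$. 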

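7·02. Interpretation of $Q^n f(t)$ as a single integral. If $f(t) = 1$ we have by successive 
integrations 

$$Q1 = t, \qquad Q^2 1 = \frac{t^2}{2!}, \qquad Q^n 1 = \frac{t^n}{n!}. \tag{1}$$

For a general function $f(t)$ we have by definition 

$$Qf(t) = \int_0^t f(\tau)\,d\tau, \tag{2}$$

$$Q^2 f(t) = Q\int_0^t f(\tau)\,d\tau = \int_0^t d\xi \int_0^\xi f(\tau)\,d\tau. \tag{3}$$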

* J. de Math. (2) 9, 1864, 185-222. † Operational Methods in Mathematical Physics, 1927. 

The integral is over the shaded region in the figure. We change the order of integration 
and get 

$$Q^2 f(t) = \int_0^t f(\tau)\,d\tau \int_\tau^t d\xi = \int_0^t (t - \tau) f(\tau)\,d\tau. \tag{4}$$

Similarly 

$$Q^3 f(t) = Q\int_0^t (t - \tau) f(\tau)\,d\tau = \int_0^t \left\{\int_0^\xi (\xi - \tau) f(\tau)\,d\tau\right\} d\xi 
= \int_0^t \left\{\int_\tau^t (\xi - \tau) f(\tau)\,d\xi\right\} d\tau = \int_0^t \frac{(t - \tau)^2}{2!} f(\tau)\,d\tau, \tag{5}$$

and in general, by induction, 

$$Q^n f(t) = \int_0^t \frac{(t - \tau)^{n-1}}{(n-1)!} f(\tau)\,d\tau. \tag{6}$$

The same may be proved as follows. $Q^n f(t)$ is the function that vanishes at $t = 0$ 
and has derivative equal to $Q^{n-1} f(t)$. The former condition is 
satisfied by (6). For the latter we differentiate under the 
integral sign with regard to $t$, and the result is 

$$\int_0^t \frac{(t - \tau)^{n-2}}{(n-2)!} f(\tau)\,d\tau. \tag{7}$$

The integrand vanishes at the upper limit if $n > 1$ and therefore 
differentiation of the limits gives nothing. Hence if the result 
is right for $Q^{n-1}$ it is right for $Q^n$; but it is right for $n = 1$; 
hence it is right for all positive integer values of $n$. 

Thus any integral power of Q operating on a function gives a result expressible by a 
single integral, provided only that the function is integrable. 
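
As a numerical illustration of 7·02 (6), the sketch below (an addition, not from the original 
text) compares $n$ successive integrations of a polynomial with the single integral 
$\int_0^t (t-\tau)^{n-1} f(\tau)\,d\tau/(n-1)!$. The function names and the use of Simpson's rule 
for the single integral are assumptions made only for this example. 

```python
from fractions import Fraction
from math import factorial

def Q(poly):
    """Integrate a polynomial (ascending coefficients) from 0 to t."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(poly)]

def evaluate(poly, t):
    """Value of the polynomial at t."""
    return sum(c * t**k for k, c in enumerate(poly))

def repeated_integration(poly, n, t):
    """Q^n f(t) obtained by applying Q exactly n times."""
    for _ in range(n):
        poly = Q(poly)
    return float(evaluate(poly, Fraction(t)))

def single_integral(poly, n, t, steps=2000):
    """Simpson's rule for int_0^t (t - tau)^(n-1)/(n-1)! f(tau) d tau."""
    h = t / steps
    def integrand(tau):
        return (t - tau) ** (n - 1) / factorial(n - 1) * float(evaluate(poly, tau))
    total = integrand(0.0) + integrand(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

f = [1, 2, 3]                               # f(t) = 1 + 2t + 3t^2
lhs = repeated_integration(f, 3, 1.5)       # Q^3 f at t = 1.5
rhs = single_integral(f, 3, 1.5)            # formula (6) with n = 3
```

The two values agree to within the quadrature error, as (6) requires. 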

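7·03. Series of operators. These rules permit us to express as a single integral any 
sum of a finite number of terms of the form 

$$(a_0 + a_1 Q + a_2 Q^2 + \ldots + a_n Q^n) f(t) 
= a_0 f(t) + \sum_{r=1}^{n} a_r \int_0^t \frac{(t - \tau)^{r-1}}{(r-1)!} f(\tau)\,d\tau 
= a_0 f(t) + \int_0^t \left\{a_1 + a_2 (t - \tau) + \ldots + a_n \frac{(t - \tau)^{n-1}}{(n-1)!}\right\} f(\tau)\,d\tau.$$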


The extension to infinite series requires special justification. 

7·031. Convergence theorem. If $\int_0^T f(t)\,dt$ exists, and if the series 

$$a_0 + a_1 z + \ldots + a_n z^n + \ldots \tag{1}$$

converges for some value of $z$ different from zero, then the series 

$$(a_0 + a_1 Q + \ldots + a_n Q^n + \ldots) f(t) \tag{2}$$

is uniformly and absolutely convergent for $0 \le t \le T$, and is equal to 

$$a_0 f(t) + \sum_{n=1}^{\infty} a_n \int_0^t \frac{(t - \tau)^{n-1}}{(n-1)!} f(\tau)\,d\tau. \tag{3}$$

The two series (2) and (3) are equal term by term and therefore have the same sum if 
either of them converges. But since the series (1) converges for, say, z = r, then for any 


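positive $\rho$ less than $|r|$ we can find a quantity $M$ such that for all $n$ 

$$|a_n|\,\rho^n < M. \tag{4}$$

Also since $Qf(t)$ is bounded for $0 \le t \le T$, say, there is a positive quantity $C$ such that 

$$|Qf(t)| < C \qquad (0 \le t \le T). \tag{5}$$

Hence 

$$|Q^2 f(t)| < \left|\int_0^t C\,d\tau\right| = C|t|, \qquad |Q^3 f(t)| < \int_0^t C|\tau|\,d\tau = \tfrac{1}{2} C|t^2|, \tag{6}$$

and in general 

$$|Q^n f(t)| < \frac{C}{(n-1)!}\,|t^{n-1}|. \tag{7}$$

Hence the moduli of the terms of the series after the first are respectively less than the 
terms of the series 

$$C\left\{|a_1| + |a_2|\,|t| + |a_3|\,\frac{|t|^2}{2!} + \ldots + |a_n|\,\frac{|t|^{n-1}}{(n-1)!} + \ldots\right\} \tag{8}$$

$$< \frac{CM}{\rho}\left\{1 + \frac{|t|}{\rho} + \frac{|t|^2}{2!\,\rho^2} + \ldots + \frac{1}{(n-1)!}\left|\frac{t}{\rho}\right|^{n-1} + \ldots\right\}. \tag{9}$$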

But this is an exponential series and is absolutely convergent for all $t$. Further, the terms 
are not greater than those of the series obtained by replacing $t$ by $T$, and this also is a 
convergent series of positive terms independent of $t$. Hence our series satisfies the M test 
for uniform convergence in the range $0 \le t \le T$; that is, in any range of $t$ such that $Qf(t)$ 
is bounded. 

If $f(t)$ is a continuous function, every term of the series is a continuous function; and 
since the sum of a uniformly convergent series of continuous functions is continuous, it 
follows that the sum of the series is a continuous function of $t$ in the range $0 \le t \le T$. For 
the same reason the summation can be carried out under the integral sign: thus 

$$\sum_{n=0}^{\infty} a_n Q^n f(t) = a_0 f(t) + \int_0^t \left(\sum_{n=1}^{\infty} a_n \frac{(t - \tau)^{n-1}}{(n-1)!}\right) f(\tau)\,d\tau. \tag{10}$$

The argument is still valid if the terms of (1) are required only to be bounded; also if 
$\int_0^T f(\tau)\,d\tau$ exists only as an improper integral owing to $f(\tau)$ being unbounded. 
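
The point of the theorem is that the operational series converges for every $t$ even when the 
numerical series has only a finite radius of convergence. A small Python sketch (an 
illustration added here, with hypothetical function names) takes $a_n = 1$ and $f(t) = 1$, so that 
the numerical series $1 + z + z^2 + \ldots$ diverges for $|z| \ge 1$ while the operational series 
reduces to $\sum t^n/n! = e^t$ by 7·02 (1). 

```python
from math import exp, factorial

def operational_partial_sum(coeffs, t):
    """Sum of a_n Q^n applied to f(t) = 1, using Q^n 1 = t^n / n!."""
    return sum(a * t**n / factorial(n) for n, a in enumerate(coeffs))

def power_series_partial_sum(coeffs, z):
    """Partial sum of the numerical series a_0 + a_1 z + a_2 z^2 + ..."""
    return sum(a * z**n for n, a in enumerate(coeffs))

a = [1.0] * 60     # a_n = 1: the series in z converges only for |z| < 1
t = 5.0            # well outside that circle

numerical_60 = power_series_partial_sum(a, t)         # grows without limit
operational_20 = operational_partial_sum(a[:20], t)   # already close to e^5
operational_60 = operational_partial_sum(a, t)        # agrees with e^5 closely
```

The partial sums of the operational series settle down to $e^5$, whereas those of the 
numerical series at $z = 5$ increase without limit. 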

7·032. Composition of operators. Let 

$$F(z) = a_0 + a_1 z + a_2 z^2 + \ldots, \tag{1}$$

$$G(z) = b_0 + b_1 z + b_2 z^2 + \ldots, \tag{2}$$

be two series that both converge for some non-zero value of $|z|$, say $r$. Then for any 
positive quantity $\rho$ less than $r$, quantities $M$, $N$ exist such that 

$$|a_n| < M/\rho^n, \qquad |b_n| < N/\rho^n. \tag{3}$$

For any $z$ such that $|z| \le \rho$ the product series 

$$F(z)\,G(z) = a_0 b_0 + (a_0 b_1 + a_1 b_0) z + (a_0 b_2 + a_1 b_1 + a_2 b_0) z^2 + \ldots 
= c_0 + c_1 z + c_2 z^2 + \ldots \tag{4}$$

is absolutely convergent. We shall prove that if $F(Q)$ and $G(Q)$ are the operational series 
obtained by replacing $z$ by $Q$, and if $G(Q)$ operates on a function $\phi(t)$ satisfying the 
conditions of the last theorem, and if $F(Q)$ then operates on the resulting function, the result 
is the same as if we replaced $z$ by $Q$ in the product series (4) and operated on $\phi(t)$ directly. 
We have 

$$F(Q)\,G(Q)\,\phi(t) = F(Q)\,(b_0 + b_1 Q + b_2 Q^2 + \ldots)\,\phi(t). \tag{5}$$


By the last theorem the series representing $G(Q)\phi(t)$ is an absolutely and uniformly 
convergent series, and can therefore be integrated term by term. Hence 

$$Q^m (b_0 + b_1 Q + b_2 Q^2 + \ldots)\,\phi(t) = b_0 Q^m \phi(t) + b_1 Q^{m+1} \phi(t) + b_2 Q^{m+2} \phi(t) + \ldots \tag{6}$$

and 

$$F(Q)\,G(Q)\,\phi(t) = \sum_m \sum_n a_m b_n Q^{m+n} \phi(t), \tag{7}$$

where the summation with regard to $n$ is to be carried out first. But if $|Q\phi(t)| < C$ and 
$m + n \ge 1$ 

$$|a_m b_n Q^{m+n} \phi(t)| < \frac{MNC}{\rho}\,\frac{1}{(m+n-1)!}\left(\frac{T}{\rho}\right)^{m+n-1}, \tag{8}$$

and the terms are less than those of a double series of positive terms. All terms of this 
series with the same $m + n = k$ are equal and there are $k + 1$ of them, $n$ ranging from 0 
to $k$. Hence their sum is 

$$\frac{MNC}{\rho}\,\frac{k+1}{(k-1)!}\left(\frac{T}{\rho}\right)^{k-1}, \tag{9}$$

and the sum of this with regard to $k$ is convergent. Hence the series in (7) is absolutely 
convergent and can be rearranged in any order without affecting its convergence or its 
sum. We can therefore collect all terms in $Q^{m+n}$; but we then have 

$$F(Q)\,G(Q)\,\phi(t) = (c_0 + c_1 Q + c_2 Q^2 + \ldots)\,\phi(t), \tag{10}$$

which proves the theorem. 
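
For operators that are polynomials in $Q$ the theorem can be checked exactly. The sketch below 
is an added illustration, not the authors' procedure; `apply_operator` and `cauchy_product` are 
hypothetical names, and both the operators and the operand are assumed to be stored as 
coefficient lists in ascending powers. 

```python
from fractions import Fraction

def Q(poly):
    """Integrate a polynomial (ascending coefficients) from 0 to t."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(poly)]

def add(p, q):
    """Sum of two polynomials of possibly different lengths."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [x + y for x, y in zip(p, q)]

def apply_operator(coeffs, poly):
    """Apply a_0 + a_1 Q + a_2 Q^2 + ... (a finite list) to a polynomial in t."""
    result, power = [Fraction(0)], [Fraction(c) for c in poly]
    for a in coeffs:
        result = add(result, [a * c for c in power])
        power = Q(power)          # next power of Q applied to the operand
    return result

def cauchy_product(a, b):
    """Coefficients c_k of the product series, c_k = sum of a_m b_n with m + n = k."""
    c = [Fraction(0)] * (len(a) + len(b) - 1)
    for m, am in enumerate(a):
        for n, bn in enumerate(b):
            c[m + n] += am * bn
    return c

F = [Fraction(1), Fraction(2)]                 # F(Q) = 1 + 2Q
G = [Fraction(3), Fraction(0), Fraction(1)]    # G(Q) = 3 + Q^2
phi = [Fraction(1), Fraction(1)]               # phi(t) = 1 + t

successive = apply_operator(F, apply_operator(G, phi))   # F(Q){G(Q) phi}
direct = apply_operator(cauchy_product(F, G), phi)       # product series applied once
```

Both routes give $3 + 9t + \tfrac{7}{2}t^2 + \tfrac{1}{2}t^3 + \tfrac{1}{12}t^4$. 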

7·04. First order linear differential equations. Now consider the linear 
differential equation of the first order 

$$\frac{dx}{dt} - \alpha x = \phi(t), \tag{1}$$

where $\alpha$ is constant and $x$ is given to be equal to $x_0$ at $t = 0$. Replace $t$ by $\tau$ and 
integrate both sides with regard to $\tau$ from 0 to $t$. We get 

$$x - x_0 - \alpha Qx = Q\phi(t), \tag{2}$$

that is, 

$$(1 - \alpha Q)\,x = x_0 + Q\phi(t), \tag{3}$$

and the expression on the right is a continuous function of $t$ if $\phi(t)$ is integrable. Now 
operate on both sides of the equation with the series $(1 + \alpha Q + \alpha^2 Q^2 + \ldots)$. By the last 
theorem the result of performing the operations $1 - \alpha Q$ and $1 + \alpha Q + \alpha^2 Q^2 + \ldots$ successively 
on $x$ is simply $x$; for if we put $z$ for $Q$ in these series we get two series convergent for 
$|z| < 1/|\alpha|$ whose product series is 1. Hence 

$$x = (1 + \alpha Q + \alpha^2 Q^2 + \ldots)\{x_0 + Q\phi(t)\}. \tag{4}$$

This gives a formal solution, which could be developed in a series of powers of $\alpha$. But we 
know otherwise that the solution of (1) is 

$$x = x_0 e^{\alpha t} + e^{\alpha t}\int_0^t e^{-\alpha\tau}\phi(\tau)\,d\tau. \tag{5}$$

Hence the expressions on the right of (4) and (5) are equal for all values of $x_0$; therefore 

$$(1 + \alpha Q + \alpha^2 Q^2 + \ldots)\,x_0 = e^{\alpha t} x_0, \tag{6}$$

$$(1 + \alpha Q + \alpha^2 Q^2 + \ldots)\,Q\phi(t) = e^{\alpha t} Q\{e^{-\alpha t}\phi(t)\}. \tag{7}$$


We have so far given no meaning to division by an operator. But just for that reason we 
are entitled to do so now, provided that we can ensure consistency. If, for instance, we 
give a meaning to $\dfrac{1}{1 - \alpha Q}\,g(t)$, it must be such that the operator $1 - \alpha Q$ acting on it gives 
$g(t)$. But we have 

$$(1 - \alpha Q)(1 + \alpha Q + \alpha^2 Q^2 + \ldots)\,g(t) = g(t) \tag{8}$$

by the rule for the composition of operators, for all $g(t)$ such that the operations are 
applicable. Hence we may define 

$$\frac{1}{1 - \alpha Q} = 1 + \alpha Q + \alpha^2 Q^2 + \ldots, \tag{9}$$

and write the above interpretations in the compact form 

$$\frac{1}{1 - \alpha Q}\,1 = e^{\alpha t}, \tag{10a}$$

$$\frac{Q}{1 - \alpha Q}\,\phi(t) = e^{\alpha t} Q\{e^{-\alpha t}\phi(t)\} = \int_0^t e^{\alpha(t - \tau)}\phi(\tau)\,d\tau. \tag{10b}$$
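
The interpretations (10a) and (10b) can be tested numerically. In the sketch below (an added 
illustration; the function names, the polynomial operand and the use of Simpson's rule are 
assumptions) the series $Q + \alpha Q^2 + \alpha^2 Q^3 + \ldots$ is applied term by term to a polynomial 
$\phi(t)$, using $Q^n t^k = k!\,t^{n+k}/(n+k)!$, and compared with the single integral in (10b). 

```python
from math import exp, factorial

def series_form(alpha, poly, t, terms=60):
    """(Q + alpha Q^2 + alpha^2 Q^3 + ...) phi(t) for a polynomial phi."""
    total = 0.0
    for n in range(1, terms + 1):
        for k, c in enumerate(poly):
            # Q^n t^k = k! t^(n+k) / (n+k)!
            total += alpha ** (n - 1) * c * factorial(k) * t ** (n + k) / factorial(n + k)
    return total

def integral_form(alpha, poly, t, steps=2000):
    """Simpson's rule for int_0^t exp(alpha (t - tau)) phi(tau) d tau."""
    h = t / steps
    def integrand(tau):
        return exp(alpha * (t - tau)) * sum(c * tau**k for k, c in enumerate(poly))
    total = integrand(0.0) + integrand(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

alpha, phi, t = -0.8, [2.0, 0.0, 1.0], 1.2    # phi(t) = 2 + t^2

# (10a): 1/(1 - alpha Q) applied to 1 is the exponential series
unity = sum((alpha * t) ** n / factorial(n) for n in range(60))

lhs = series_form(alpha, phi, t)      # left side of (10b), expanded in powers of Q
rhs = integral_form(alpha, phi, t)    # right side of (10b)
```

With these values `unity` agrees with $e^{\alpha t}$ and `lhs` with `rhs` to within the quadrature error. 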


Similarly, if $F(z)$ is any function of $z$ expansible in a power series near $z = 0$, $F(0)$ not 
being zero, and $G(z)$ is the reciprocal series, we can interpret $1/F(Q)$ to mean $G(Q)$. That is, 
the fundamental interpretation of any operator is always its expansion in positive powers 
of $Q$ as if $Q$ was a constant. Operators such as $Q^{-n}$ and $e^{h/Q}$ are not expansible in positive 
powers of $Q$ in the sense indicated; that is, the functions $z^{-n}$ and $e^{h/z}$ are not capable of 
being expanded in positive powers of $z$. Consequently we can give them no interpretation 
at present; and it turns out that in the important class of physical problems that require 
the solution of a finite number of linear differential equations with constant coefficients 
such operators do not arise. 

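We have 

$$\frac{1}{(1 - \alpha Q)^2}\,1 = (1 + 2\alpha Q + 3\alpha^2 Q^2 + \ldots)\,1 = 1 + 2\alpha t + \tfrac{3}{2}\alpha^2 t^2 + \ldots, \tag{11}$$

which is not an immediately recognizable form; but 

$$\frac{Q}{(1 - \alpha Q)^2}\,1 = (Q + 2\alpha Q^2 + 3\alpha^2 Q^3 + \ldots + n\alpha^{n-1} Q^n + \ldots)\,1 
= t + \alpha t^2 + \frac{\alpha^2 t^3}{2!} + \ldots + \frac{\alpha^{n-1} t^n}{(n-1)!} + \ldots = t e^{\alpha t}. \tag{12}$$

We now see that 

$$\frac{1}{(1 - \alpha Q)^2}\,1 = \frac{1}{1 - \alpha Q}\,1 + \frac{\alpha Q}{(1 - \alpha Q)^2}\,1 = (1 + \alpha t)\,e^{\alpha t},$$

which is the same as (11). But $(1 - \alpha Q)^{-n}\,1$ contains more and more terms in its finite 
interpretation as $n$ increases; $Q^{n-1}(1 - \alpha Q)^{-n}\,1$ does not, for 

$$\frac{Q^{n-1}}{(1 - \alpha Q)^n}\,1 = \left\{Q^{n-1} + n\alpha Q^n + \frac{n(n+1)}{2!}\alpha^2 Q^{n+1} + \ldots\right\} 1 
= \frac{t^{n-1}}{(n-1)!} + \frac{\alpha t^n}{(n-1)!} + \frac{\alpha^2 t^{n+1}}{2!\,(n-1)!} + \ldots 
= \frac{t^{n-1} e^{\alpha t}}{(n-1)!},$$

on interpreting the separate terms by 7·02 (1). Similarly, by expansion, 

$$\frac{Q^n}{(1 - \alpha Q)^n}\,\phi(t) = \int_0^t \phi(\tau)\,\frac{(t - \tau)^{n-1}}{(n-1)!}\,e^{\alpha(t - \tau)}\,d\tau.$$

Alternatively we may use (10b): we have 

$$\frac{Q^2}{(1 - \alpha Q)^2}\,\phi(t) = \frac{Q}{1 - \alpha Q}\int_0^t e^{\alpha(t - \tau)}\phi(\tau)\,d\tau 
= \int_0^t e^{\alpha(t - \tau)}\,d\tau \int_0^\tau e^{\alpha(\tau - \xi)}\phi(\xi)\,d\xi$$

$$= \int_0^t \left[\int_\xi^t e^{\alpha(t - \xi)}\phi(\xi)\,d\tau\right] d\xi = \int_0^t \phi(\xi)\,(t - \xi)\,e^{\alpha(t - \xi)}\,d\xi,$$

and so on as in proving 7·02 (6). 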

7·05. Set of n linear differential equations of the first order. Consider the 
equations 

$$\begin{aligned} 
e_{11} y_1 + e_{12} y_2 + \ldots + e_{1n} y_n &= S_1, \\ 
e_{21} y_1 + e_{22} y_2 + \ldots + e_{2n} y_n &= S_2, \\ 
\ldots\ldots\ldots\ldots & \\ 
e_{n1} y_1 + e_{n2} y_2 + \ldots + e_{nn} y_n &= S_n, 
\end{aligned} \tag{1}$$

where $y_1, y_2, \ldots, y_n$ are dependent variables, $t$ is the independent variable, $S_1$ to $S_n$ are 
known integrable functions of $t$, in $0 \le t \le T$, and 

$$e_{rs} = a_{rs}\frac{d}{dt} + b_{rs}, \tag{2}$$

where $a_{rs}$ and $b_{rs}$ are constants. We do not assume at present that $a_{rs} = a_{sr}$, $b_{rs} = b_{sr}$, but 
we do assume that the determinant 

$$A = \|a_{rs}\| \tag{3}$$

is not zero. If it is, we shall be able to show later that there is a defect in the specification 
of the conditions. 





Using the summation convention we can write the equations in the form 

$$a_{rs}\frac{dy_s}{dt} + b_{rs} y_s = S_r. \tag{4}$$

Perform the operation $Q$ on both sides of each equation; that is, replace $t$ for a moment by 
an auxiliary variable $\tau$ and integrate with regard to $\tau$ from 0 to $t$. We get 

$$a_{rs}(y_s - u_s) + b_{rs} Q y_s = Q S_r, \tag{5}$$

$u_s$ being the value of $y_s$ at $t = 0$. We rearrange this in the form 

$$f_{rs} y_s = (a_{rs} + b_{rs} Q)\,y_s = a_{rs} u_s + Q S_r. \tag{6}$$

These equations take account of both the differential equations and the initial conditions. 
Now let $D$ denote the operational determinant 

$$D = \|f_{rs}\|. \tag{7}$$

If this determinant is expanded by the rules of algebra we shall obtain a polynomial in $Q$, 
in general of the $n$th degree. The term not containing $Q$ is $A$, which by hypothesis does not 
vanish. Let $F_{rs}$ be the cofactor of $f_{rs}$ in this determinant. $F_{rs}$ also is a polynomial in $Q$. 

Now operate on the first equation of (6) with $F_{1m}$, the second with $F_{2m}$, and so on, and 
add. We have 

$$F_{rm} f_{rs} y_s = F_{rm}(a_{rs} u_s + Q S_r). \tag{8}$$

But $F_{rm} f_{rs} = 0$ unless $m = s$, for it is a determinant with two columns equal. If $m = s$, 
$F_{rm} f_{rs} = D$. Therefore 

$$D y_m = F_{rm}(a_{rs} u_s + Q S_r). \tag{9}$$

The expression on the right is a bounded integrable function of $t$ because the $S_r$ are. Also 
the function $D(z)$, obtained by replacing $Q$ by a number $z$, is not zero at $z = 0$ because 
$A \ne 0$. Hence $1/D(z)$ can, for $|z|$ less than some positive quantity $\rho$, be expressed as a power 
series in $z$. In accordance with the rule of 7·04 we define $D^{-1}$ as the power series in $Q$ 
obtained by putting $Q$ for $z$ in the series for $1/D(z)$. Then operate on both sides of (9) 
with $D^{-1}$. We get 

$$D^{-1} D y_m = D^{-1} F_{rm}(a_{rs} u_s + Q S_r). \tag{10}$$

But series of powers of $Q$ can be multiplied together according to the rules of algebra; 
hence $D^{-1} D y_m$ is simply $y_m$, and we have the formal solution 

$$y_m = D^{-1} F_{rm}(a_{rs} u_s + Q S_r). \tag{11}$$

The fundamental rule of interpretation is that the operators are to be multiplied out and 
interpreted term by term, but we shall see that they can all be reduced, at the worst, to 
single integrals by means of rules that we have already. The series $D^{-1}$ operating on an 
integrable function always gives a convergent series; hence the result has a meaning, and 
must be the solution corresponding to the differential equations and the initial 
conditions if these have a solution at all. To show that there actually is a solution we must 
verify that (11) satisfies the initial conditions and the differential equations. 

First, if $t$ tends to 0 all terms containing $Q$ tend to 0; then $D^{-1}$ tends to $A^{-1}$, $F_{rm}$ to $A_{rm}$, 
the cofactor of $a_{rm}$ in $A$. Hence for $t = 0$ 

$$y_m = A^{-1} A_{rm} a_{rs} u_s. \tag{12}$$

But $A_{rm} a_{rs} = 0$ unless $m = s$, when it is equal to $A$. Hence 

$$y_m = u_m,$$

and the solution satisfies the initial conditions. 

Secondly, 

$$a_{rs} = f_{rs} - b_{rs} Q, \tag{13}$$

and therefore the solution can be written 

$$y_m = D^{-1} F_{rm}(f_{rs} u_s - b_{rs} u_s Q + Q S_r). \tag{14}$$

The first term as before reduces to $u_m$ on summation. Hence 

$$y_m = u_m + D^{-1} F_{rm} Q (S_r - b_{rs} u_s), \tag{15}$$

and the last term consists of positive powers of $Q$ operating on a known function. But 

$$\frac{d}{dt} Q f(t) = \frac{d}{dt}\int_0^t f(\tau)\,d\tau = f(t). \tag{16}$$

Hence 

$$\left(a_{vm}\frac{d}{dt} + b_{vm}\right) Q f(t) = (a_{vm} + b_{vm} Q)\,f(t) = f_{vm} f(t), \tag{17}$$

$$\left(a_{vm}\frac{d}{dt} + b_{vm}\right) y_m = b_{vm} u_m + f_{vm} D^{-1} F_{rm}(S_r - b_{rs} u_s). \tag{18}$$

But again $f_{vm} F_{rm} = 0$ unless $r = v$, when it is equal to $D$. Hence the last term reduces to 
$S_v - b_{vs} u_s$. The second term cancels the term $b_{vm} u_m$ and finally 

$$\left(a_{vm}\frac{d}{dt} + b_{vm}\right) y_m = S_v, \tag{19}$$

which shows that the solution obtained satisfies the differential equations and completes 
the proof that the problem has a unique solution given by (11). 

We now consider the case of $A = 0$. Multiply the equations (1) by the respective $A_{rm}$ 
and add. Then $a_{rs} A_{rm} = 0$ even for $s = m$, since the sum is then $A$. Hence 

$$A_{rm} b_{rs} y_s = A_{rm} S_r, \tag{20}$$

for all $t$, and in particular for $t = 0$. Thus the values of the $y_s$ at $t = 0$ cannot be assigned 
independently. If they are assigned so as to satisfy (20) there is a fixed relation between 
the $y_s$ for all time and one of the variables can be eliminated; if they do not satisfy (20) the 
conditions are self-contradictory. The condition that $A \ne 0$ (i.e. $a_{rs}$ is a matrix of rank $n$) 
therefore expresses the condition that the initial values of the unknowns can be assigned 
independently, and will be satisfied in any properly stated problem. 

7·051. The symbol p. The process of getting the operational solution (11) from (6) 
is the same as that of solving a set of algebraic equations in the $y_s$. The above procedure 
is the most convenient for establishing the general theorems, but the actual evaluation 
of the operational solution is made easier by a change of notation. We replace $Q$ by $p^{-1}$; 
the rule that operators must be expressible in the form 

$$a_0 + a_1 Q + \ldots$$

then becomes the rule that they must be expressible in the form 

$$a_0 + a_1 p^{-1} + \ldots,$$

where the coefficients are such that the series 

$$a_0 + a_1 z + \ldots$$

converges for some value of $z$ different from 0. Rewriting in this notation the 
interpretations that we have so far obtained we have 

$$\begin{aligned} 
& p^{-1}\,1 = t, \qquad p^{-2}\,1 = \tfrac{1}{2}t^2, \qquad p^{-n}\,1 = \frac{t^n}{n!}, \\ 
& p^{-1} f(t) = \int_0^t f(\tau)\,d\tau, \qquad p^{-n} f(t) = \int_0^t \frac{(t - \tau)^{n-1}}{(n-1)!} f(\tau)\,d\tau, \\ 
& \sum_{n=0}^{\infty} a_n p^{-n} f(t) = a_0 f(t) + \int_0^t f(\tau)\left(\sum_{n=1}^{\infty} a_n \frac{(t - \tau)^{n-1}}{(n-1)!}\right) d\tau, \\ 
& \frac{p}{p - \alpha}\,1 = e^{\alpha t}, \qquad \frac{1}{p - \alpha}\,f(t) = \int_0^t f(\tau)\,e^{\alpha(t - \tau)}\,d\tau, \\ 
& \frac{p}{(p - \alpha)^n}\,1 = \frac{t^{n-1}}{(n-1)!}\,e^{\alpha t}, \qquad \frac{1}{(p - \alpha)^n}\,f(t) = \int_0^t f(\tau)\,\frac{(t - \tau)^{n-1}}{(n-1)!}\,e^{\alpha(t - \tau)}\,d\tau. 
\end{aligned} \tag{21}$$


The advantage of this notation is that the operators in the last two equations expressible 
by a single term have $p$ or 1 in the numerator instead of $Q^{n-1}$ or $Q^n$. 

We also have immediately by direct expansion 

$$\frac{p^2}{p^2 + n^2}\,1 = \cos nt, \qquad \frac{np}{p^2 + n^2}\,1 = \sin nt, \tag{22}$$

$$\frac{p^2}{p^2 - n^2}\,1 = \cosh nt, \qquad \frac{np}{p^2 - n^2}\,1 = \sinh nt. \tag{23}$$

Returning to (6) we see that as the solution is a purely algebraic process, if we write 
$p^{-1}$ for $Q$ in each of the equations (6) and then formally multiply by $p$, and carry through 
the solution by algebra we shall arrive at the same solution, provided that we keep to the 
fundamental rule that operators are to be expanded in zero and negative powers of $p$ 
before interpretation. But with this rule we get in place of (6) 

$$(a_{rs} p + b_{rs})\,y_s = p\,a_{rs} u_s + S_r. \tag{24}$$

These equations are called the subsidiary equations. They are formed from the differential 
equations as follows, as we see on inspection. 

Write $p$ for $d/dt$ on the left of each equation; to the right of each equation add the result of 
dropping the $b_{rs}$ on the left and replacing the $y_s$ by their initial values. The resulting subsidiary 
equations are to be solved by algebra as if $p$ was a number; and the result is to be interpreted 
by expanding in decreasing powers of $p$ and interpreting $p^{-1}$ as the operation of integrating 
from 0 to $t$. 


7·052. Partial fraction rule. Since the operational solution (11) is expansible in 
powers of $Q$ or $p^{-1}$, beginning with a constant term, the operator must be of the form 
$F(p)/G(p)$, where $F(p)$ is a polynomial in $p$ of the same degree as $G(p)$ or lower. If $p$ is 
replaced by a number $z$, $F(z)/G(z)$ is a rational function of $z$ and can therefore be expressed 
in partial fractions. Each such fraction can be expressed in descending powers of $z$, 
possibly beginning with a constant, and the sum of the expansion is the expansion of 
$F(z)/G(z)$. Consequently if we formally break up the operator into partial fractions and 
apply each separately to a given function, the sum of the results is the result of applying 
the expansion of the operator $F(p)/G(p)$ to the same function. 

The resolution is particularly simple when $G(z)$ has only simple zeros of the form $z = \alpha$ 
and is not zero at $z = 0$. We have then the algebraic identity 

$$\frac{F(p)}{p\,G(p)} = \frac{F(0)}{p\,G(0)} + \sum_\alpha \frac{F(\alpha)}{\alpha G'(\alpha)}\,\frac{1}{p - \alpha},$$

whence 

$$\frac{F(p)}{G(p)} = \frac{F(0)}{G(0)} + \sum_\alpha \frac{F(\alpha)}{\alpha G'(\alpha)}\,\frac{p}{p - \alpha},$$

$$\frac{F(p)}{G(p)}\,1 = \frac{F(0)}{G(0)} + \sum_\alpha \frac{F(\alpha)}{\alpha G'(\alpha)}\,e^{\alpha t}. \tag{25}$$


Hence the part of the solution that depends on the initial conditions is expressed directly 
in finite terms. A different form is more convenient when the function operated on is not 
a constant; we can write 

$$\frac{F(p)}{G(p)} = \lim_{z \to \infty}\frac{F(z)}{G(z)} + \sum_\alpha \frac{F(\alpha)}{G'(\alpha)}\,\frac{1}{p - \alpha},$$

$$\frac{F(p)}{G(p)}\,S(t) = \lim_{z \to \infty}\frac{F(z)}{G(z)}\,S(t) + \sum_\alpha \frac{F(\alpha)}{G'(\alpha)}\int_0^t S(\tau)\,e^{\alpha(t - \tau)}\,d\tau. \tag{26}$$

The interpretation (25) is often called Heaviside's expansion theorem. But his methods 
involve two other expansion theorems, namely, expansion in powers of $p^{-1}$ and in powers 
of $e^{-ph}$, where $h$ is a constant, and in the present treatment the former is fundamental. 
Consequently (25) will be called the partial fraction rule in the present work. It can be 
read: Divide by $p$, put into partial fractions, multiply by $p$ and interpret. 
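
A short sketch of the partial fraction rule (25) for real simple zeros follows. It is an added 
illustration, not the authors' procedure: the function names are hypothetical, polynomials are 
assumed to be given by coefficients in descending powers of $p$, and the zeros of $G$ are assumed 
known, real, simple and different from 0. The example operator $1/\{(p+1)(p+2)\}$ acting on 1 
corresponds to $d^2x/dt^2 + 3\,dx/dt + 2x = 1$ with $x$ and $dx/dt$ zero at $t = 0$. 

```python
from math import exp

def polyval(coeffs, z):
    """Evaluate a polynomial with coefficients in descending powers (Horner)."""
    result = 0.0
    for c in coeffs:
        result = result * z + c
    return result

def polyder(coeffs):
    """Coefficients of the derivative, again in descending powers."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def partial_fraction_rule(F, G, zeros, t):
    """F(p)/G(p) applied to 1, by (25): F(0)/G(0) + sum F(a)/(a G'(a)) e^(a t)."""
    dG = polyder(G)
    value = polyval(F, 0.0) / polyval(G, 0.0)
    for a in zeros:
        value += polyval(F, a) / (a * polyval(dG, a)) * exp(a * t)
    return value

F = [1.0]                  # F(p) = 1
G = [1.0, 3.0, 2.0]        # G(p) = p^2 + 3p + 2 = (p + 1)(p + 2)
t = 0.9

x = partial_fraction_rule(F, G, [-1.0, -2.0], t)
expected = 0.5 - exp(-t) + 0.5 * exp(-2 * t)   # the three terms of (25) written out
```

The result $\tfrac{1}{2} - e^{-t} + \tfrac{1}{2}e^{-2t}$ vanishes with its first derivative at $t = 0$ and satisfies the 
differential equation, as may be verified by substitution. 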

If there are multiple zeros of $G(p)$, or if it contains $p$ as a factor, the expression of 
$F(p)/pG(p)$ in partial fractions can still be carried out, but there will be terms of the form 
$p^{-s}$ or $(p - \alpha)^{-s}$, and $F(p)/G(p)$ will contain terms of the forms $p^{-(s-1)}$ or $p/(p - \alpha)^s$. These 
can be interpreted by means of (21). Consequently, whenever the functions $S_r$ are 
integrable and the initial values of the unknowns can be assigned independently the 
solution can be obtained by operational methods and the result can at worst be expressed 
in terms of a finite number of single integrals. 

A convenient way of finding the terms in $(p - \alpha)^{-s}$ may be illustrated by the following 
example. Take 

$$F(p) = \frac{p}{(p + 1)^2 (p + 2)}.$$

When $z \to -1$, $1/(z + 2)$ tends to 1; then 

$$F(p) - \frac{p}{(p + 1)^2} = \frac{p}{(p + 1)^2}\left(\frac{1}{p + 2} - 1\right) = -\frac{p}{(p + 1)(p + 2)} = -\frac{p}{p + 1} + \frac{p}{p + 2},$$

$$F(p)\,1 = \left(\frac{p}{(p + 1)^2} - \frac{p}{p + 1} + \frac{p}{p + 2}\right) 1 = t e^{-t} - e^{-t} + e^{-2t}.$$

By subtraction we can in this way reduce the highest index in the denominator by 1 at 
each stage. Each step checks the algebra of the previous one. 

7·053. Principle of superposition. It is, however, sometimes inconvenient to have 
to use two different resolutions into partial fractions according as the operand is a constant 
or not. This can be avoided by the principle of superposition. We have in the $p$ notation, if 

$$F(p) = a_0 + a_1 p^{-1} + a_2 p^{-2} + \ldots,$$

$$F(p)\,1 = a_0 + a_1 t + a_2\frac{t^2}{2!} + \ldots + a_n\frac{t^n}{n!} + \ldots = f(t),$$

say, then 

$$f'(t) = a_1 + a_2 t + \ldots + a_n\frac{t^{n-1}}{(n-1)!} + \ldots,$$

and from the third line of (21) 

$$F(p)\,\phi(t) = a_0\,\phi(t) + \int_0^t \phi(\tau)\,f'(t - \tau)\,d\tau. \tag{27}$$

Integrating by parts we have 

$$F(p)\,\phi(t) = \phi(0)\,f(t) + \int_{\tau = 0}^{t} f(t - \tau)\,d\phi(\tau), \tag{28}$$

since $f(0) = a_0$. Hence if we know $F(p)\,1$, the evaluation of $F(p)\,\phi(t)$ is reduced to a 
single integration. Using this result we can derive 7·052 (26) from (25), and need only 
one resolution into partial fractions. 

This theorem can be interpreted physically as follows. We can regard a system as 
subject to disturbances represented by our $S_r(t)$. But if it was in the state of $y_s = 0$ up 
to time 0 and then the $y_s$ were suddenly raised to $u_s$, we could represent this as due to a 
set of impulsive disturbances, thus virtually absorbing the initial values into $QS_r$ at the 
cost of making $\int_0^\delta S_r(\tau)\,d\tau = a_{rs} u_s$ in the limit when $\delta$ is made arbitrarily small. Then the 
term $\phi(0) f(t)$ can be regarded as the residual effect at time $t$ of the impulsive disturbances 
$\phi(0)$ at time 0. The later disturbances due to $S_r$ or $\phi(t)$ can then be regarded as the resultant 
effect of numerous small disturbances $S_r\,d\tau$ or $d\phi(\tau)$ in time $d\tau$. Each produces its residual 
effect at time $t$, but the interval is now $t - \tau$ instead of $t$. Consequently their total 
contribution is the sum of elements of the form $f(t - \tau)\,d\phi(\tau)$, which gives the form of the 
integral. It is not necessary for this purpose that $\phi(t)$ should be differentiable, but if it is 
not the integral is not the usual Riemann integral but the extended form due to Stieltjes. 
(Cf. 1·10, 1·102.) 

7·054. A third method is often most convenient when the operand itself can be 
expressed in the form $G(p)\,1 = g(t)$. Then 

$$F(p)\,g(t) = F(p)\,G(p)\,1, \tag{29}$$

and we can proceed directly to the interpretation of the right side by the partial fraction rule. 
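
The two forms of the principle of superposition in 7·053, namely 
$a_0\phi(t) + \int_0^t \phi(\tau) f'(t-\tau)\,d\tau$ and $\phi(0) f(t) + \int_0^t f(t-\tau)\,d\phi(\tau)$, can be compared numerically. 
The sketch below is an added illustration under assumptions chosen for the example: the operator 
is $F(p) = p/(p - \alpha)$, so that $f(t) = F(p)\,1 = e^{\alpha t}$ and $a_0 = 1$; the operand is $\phi(t) = \cos t$, 
which is differentiable, so that $d\phi(\tau) = -\sin\tau\,d\tau$; and Simpson's rule is used for the integrals. 

```python
from math import cos, sin, exp

def simpson(g, a, b, steps=2000):
    """Composite Simpson's rule for the integral of g from a to b."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

alpha = -0.5

def f(t):
    """f(t) = F(p) 1 for F(p) = p/(p - alpha)."""
    return exp(alpha * t)

def df(t):
    """Derivative f'(t)."""
    return alpha * exp(alpha * t)

def phi(t):
    return cos(t)

def dphi(t):
    return -sin(t)

a0 = f(0.0)   # constant term of F(p) in powers of 1/p

def first_form(t):
    """a_0 phi(t) + int_0^t phi(tau) f'(t - tau) d tau."""
    return a0 * phi(t) + simpson(lambda tau: phi(tau) * df(t - tau), 0.0, t)

def second_form(t):
    """phi(0) f(t) + int_0^t f(t - tau) dphi(tau)."""
    return phi(0.0) * f(t) + simpson(lambda tau: f(t - tau) * dphi(tau), 0.0, t)

t = 2.0
v_first, v_second = first_form(t), second_form(t)
```

The two values coincide to within the quadrature error, as the integration by parts requires. 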

7·06. Equations of higher order. The method is most easily extended to equations 
of higher order by breaking them up into equations of the first order. Thus if we have 
an equation of the second order such as 

$$\frac{d^2 x}{dt^2} + a\frac{dx}{dt} + bx = 1 \tag{1}$$

with $x = x_0$, $dx/dt = x_1$ at $t = 0$, we introduce a new variable $y$ given by 

$$\frac{dx}{dt} - y = 0, \tag{2}$$

and the original equation can be replaced by 

$$\frac{dy}{dt} + bx + ay = 1. \tag{3}$$

Then (2) and (3) are two equations of the first order, and the subsidiary equations are 

$$px - y = p x_0, \tag{4}$$

$$py + bx + ay = p x_1 + 1. \tag{5}$$

Eliminating $y$ by algebra we get 

$$x = \frac{(p^2 + ap)\,x_0 + p x_1 + 1}{p^2 + ap + b}, \tag{6}$$

which we can interpret by the partial fraction rule on putting 

$$p^2 + ap + b = (p - \alpha)(p - \beta). \tag{7}$$

7·061. If we have $n$ differential equations of the second order, possibly with variable 
functions on the right, we proceed in the same way. If a typical equation is 

$$a_{rs}\frac{d^2 y_s}{dt^2} + b_{rs}\frac{dy_s}{dt} + c_{rs} y_s = S_r(t), \tag{1}$$

we take 

$$z_s = \frac{dy_s}{dt} \tag{2}$$

as defining a new set of variables $z_s$; then (1) can be written 

$$a_{rs}\frac{dz_s}{dt} + b_{rs} z_s + c_{rs} y_s = S_r(t), \tag{3}$$

and instead of $n$ equations of the second order we have now $2n$ equations of the first order. 
The operational method of solution will then work provided that the initial values of all 
the $y_s$ and $z_s$ can be assigned independently. If they are $u_s$ and $v_s$ we write the subsidiary 
equations 

$$p y_s - z_s = p u_s, \tag{4}$$

$$(a_{rs} p + b_{rs})\,z_s + c_{rs} y_s = S_r(t) + a_{rs} p v_s. \tag{5}$$

The first step of solving is to eliminate $z_s$ between these two; then 

$$(a_{rs} p + b_{rs})\,p\,(y_s - u_s) + c_{rs} y_s = S_r(t) + a_{rs} p v_s, \tag{6}$$

that is, 

$$(a_{rs} p^2 + b_{rs} p + c_{rs})\,y_s = S_r(t) + (a_{rs} p^2 + b_{rs} p)\,u_s + a_{rs} p v_s. \tag{7}$$

These can be solved for the $y_s$ as for a set of first order equations and the same rules of 
interpretation apply. The allowance for the initial values of $y_s$ and $dy_s/dt$ is made by the 
terms in $u_s$ and $v_s$ on the right. 





The determinant of the coefficients of $dy_s/dt$ and $dz_s/dt$ in (2) and (3) is 

$$\begin{vmatrix} 
a_{11} & a_{12} & \ldots & a_{1n} & 0 & \ldots & 0 \\ 
a_{21} & a_{22} & \ldots & a_{2n} & 0 & \ldots & 0 \\ 
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 
a_{n1} & a_{n2} & \ldots & a_{nn} & 0 & \ldots & 0 \\ 
0 & 0 & \ldots & 0 & 1 & \ldots & 0 \\ 
\ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\ 
0 & 0 & \ldots & 0 & 0 & \ldots & 1 
\end{vmatrix} = \|a_{rs}\|,$$

so that solution is possible with arbitrary initial values of all $y_s$ and $dy_s/dt$ provided again 
that $\|a_{rs}\| \ne 0$. 

The physical interpretation of the condition $A = \|a_{rs}\| \ne 0$ is clear in this case. The $y_s$ 
may be the coordinates of a dynamical system and will satisfy differential equations of 
the second order with regard to the time. Then the condition that the initial values of 
$y_s$ and $dy_s/dt$ can be assigned independently amounts to saying that the coordinates 
chosen and their rates of change may have any initial values. 

7·062. We have already had the rules

$$\frac{p^2}{p^2+n^2}\,1 = \cos nt, \qquad \frac{np}{p^2+n^2}\,1 = \sin nt. \tag{1}$$

These can also be verified by the partial fraction rule. If we differentiate with respect
to $n$ the series expansions of these operators in powers of $p^{-1}$ we get series that converge; hence

$$\frac{2n^2p^2}{(p^2+n^2)^2}\,1 = nt\sin nt, \qquad \left\{\frac{p}{p^2+n^2} - \frac{2n^2p}{(p^2+n^2)^2}\right\} 1 = t\cos nt, \tag{2}$$

whence

$$\frac{2n^3p}{(p^2+n^2)^2}\,1 = \sin nt - nt\cos nt. \tag{3}$$

7·04 (106) may be written

$$e^{-\alpha t}\,\frac{1}{p-\alpha}\,f(t) = \frac{1}{p}\{e^{-\alpha t}f(t)\}. \tag{4}$$

Hence

$$e^{-\alpha t}F(p-\alpha)f(t) = F(p)\{e^{-\alpha t}f(t)\}. \tag{5}$$

In particular

$$e^{-\alpha t}\,\frac{p(p-\alpha)}{(p-\alpha)^2+\beta^2}\,1 = \frac{p(p+\alpha)}{p^2+\beta^2}\,\frac{p}{p+\alpha}\,1 = \frac{p^2}{p^2+\beta^2}\,1 = \cos\beta t, \tag{6}$$

and therefore

$$\frac{p(p-\alpha)}{(p-\alpha)^2+\beta^2}\,1 = e^{\alpha t}\cos\beta t. \tag{7}$$

Also

$$e^{-\alpha t}\,\frac{\beta p}{(p-\alpha)^2+\beta^2}\,1 = \frac{\beta(p+\alpha)}{p^2+\beta^2}\,\frac{p}{p+\alpha}\,1 = \frac{\beta p}{p^2+\beta^2}\,1 = \sin\beta t, \tag{8}$$

and therefore

$$\frac{\beta p}{(p-\alpha)^2+\beta^2}\,1 = e^{\alpha t}\sin\beta t. \tag{9}$$

These are given here as an illustration of 7·04 and 7·054. Alternatively, we can apply
the partial fraction rule directly:

$$\frac{p}{p-\alpha-i\beta}\,1 = e^{(\alpha+i\beta)t} = e^{\alpha t}(\cos\beta t + i\sin\beta t),$$

and we separate real and imaginary parts.
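The rules of this section can be spot-checked numerically. The sketch below assumes the usual reading of the notation, namely that $F(p)\,1 = f(t)$ corresponds to $F(p) = p\int_0^\infty e^{-pt}f(t)\,dt$; that correspondence is an assumption of the example, not something stated in this section. The helper `operator_image` and the sample values of `n`, `alpha`, `beta` and `p` are illustrative choices only.

```python
import math

def operator_image(f, p, upper=60.0, steps=60000):
    """Approximate p * integral_0^upper exp(-p t) f(t) dt by Simpson's rule.

    Assumed convention: F(p) 1 = f(t) means
    F(p) = p * integral_0^infinity exp(-p t) f(t) dt.
    `steps` must be even; `upper` must be large enough for the tail to be negligible.
    """
    h = upper / steps
    total = f(0.0) + math.exp(-p * upper) * f(upper)
    for k in range(1, steps):
        t = k * h
        weight = 4 if k % 2 else 2
        total += weight * math.exp(-p * t) * f(t)
    return p * total * h / 3.0

n, alpha, beta, p = 2.0, 0.5, 3.0, 1.5   # p must exceed alpha for convergence

# Rule: 2 n^3 p / (p^2 + n^2)^2 . 1 = sin nt - nt cos nt
lhs_sin = 2 * n**3 * p / (p**2 + n**2) ** 2
rhs_sin = operator_image(lambda t: math.sin(n * t) - n * t * math.cos(n * t), p)

# Rule: p (p - alpha) / ((p - alpha)^2 + beta^2) . 1 = exp(alpha t) cos(beta t)
lhs_cos = p * (p - alpha) / ((p - alpha) ** 2 + beta**2)
rhs_cos = operator_image(lambda t: math.exp(alpha * t) * math.cos(beta * t), p)

print(lhs_sin, rhs_sin)
print(lhs_cos, rhs_cos)
```

Each pair of printed numbers should agree to within the quadrature error, which is a check of the rules at one value of $p$ rather than a proof.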


7·07. We sometimes want the limits, if any, of $\dfrac{F(p)}{G(p)}\,1$ and its integral as $t$ tends to infinity.
The problem of the induction balance in the next chapter is an instance. These can
be simply found from the partial fraction rule. It is not necessary to consider repeated
factors, since we can separate them by making small changes in the constants. The roots
$\alpha$ must all have negative real parts, otherwise the interpretation would contain
exponentials with positive indices and increase without limit, or else trigonometrical
terms, which will oscillate finitely. Then

$$\frac{F(p)}{G(p)}\,1 = \frac{F(0)}{G(0)} + \sum_\alpha \frac{F(\alpha)}{\alpha G'(\alpha)}\,e^{\alpha t}, \tag{10}$$

and the limit as $t$ tends to infinity is $F(0)/G(0)$. Also if the integral is to have a finite limit
$F(0)/G(0)$ must be 0; then

$$\int_0^\infty \sum_\alpha \frac{F(\alpha)}{\alpha G'(\alpha)}\,e^{\alpha t}\,dt = -\sum_\alpha \frac{F(\alpha)}{\alpha^2 G'(\alpha)} = \lim_{\lambda\to 0}\sum_\alpha \frac{F(\alpha)}{\alpha G'(\alpha)}\,\frac{1}{\lambda-\alpha} = \lim_{\lambda\to 0}\frac{F(\lambda)}{\lambda G(\lambda)}. \tag{11}$$

Hence

$$\lim_{t\to\infty}\frac{F(p)}{G(p)}\,1 = \frac{F(0)}{G(0)}, \tag{12}$$

$$\lim_{t\to\infty} p^{-1}\frac{F(p)}{G(p)}\,1 = \lim_{\lambda\to 0}\frac{F(\lambda)}{\lambda G(\lambda)}, \tag{13}$$

provided that the limits on the right exist, and that all zeros of $G(p)$ have negative real
parts.
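The two limits can be illustrated on a small case. The sketch below assumes $G(p)$ is the monic polynomial with the listed simple, non-zero zeros, and evaluates the partial-fraction interpretation term by term; the function names and the sample $F$ and zeros are hypothetical choices made for the illustration.

```python
import cmath

def g_value(p, zeros):
    """G(p) taken as the monic polynomial with the given zeros (an assumption)."""
    out = 1.0
    for a in zeros:
        out *= (p - a)
    return out

def g_prime_at_zero(a, zeros):
    """G'(a) at a simple zero a: the product of (a - other) over the other zeros."""
    out = 1.0
    for other in zeros:
        if other != a:
            out *= (a - other)
    return out

def interpret(F, zeros, t):
    """F(0)/G(0) + sum over zeros of F(a) e^{a t} / (a G'(a))."""
    value = F(0.0) / g_value(0.0, zeros)
    for a in zeros:
        value += F(a) / (a * g_prime_at_zero(a, zeros)) * cmath.exp(a * t)
    return value

def integral_limit(F, zeros):
    """Limit of the integral when F(0) = 0: minus the sum of F(a) / (a^2 G'(a))."""
    return -sum(F(a) / (a * a * g_prime_at_zero(a, zeros)) for a in zeros)

zeros = [-1.0, -2.0]                      # G(p) = (p + 1)(p + 2)
late_value = interpret(lambda p: p + 3.0, zeros, 40.0).real   # tends to F(0)/G(0) = 3/2
area = integral_limit(lambda p: p, zeros)                     # F(0) = 0; expect 1/G(0) = 1/2
print(late_value, area)
```

With $F(p) = p$ the interpretation is $e^{-t} - e^{-2t}$, whose integral from 0 to infinity is $\tfrac12$, in agreement with $\lim F(\lambda)/\lambda G(\lambda) = 1/G(0)$.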


7·08. In dynamical applications $G(p)$ is often an even function of $p$ with simple zeros
$\pm in$. We can separate $F(p)$ into even and odd parts, thus

$$F(p) = \tfrac12\{F(p)+F(-p)\} + \tfrac12\{F(p)-F(-p)\} = S(p) + pT(p), \tag{14}$$

where $S(p)$ and $T(p)$ are even functions; and then

$$\frac{F(p)}{G(p)}\,1 = \frac{S(p)+pT(p)}{G(p)}\,1 = \frac{S(0)}{G(0)} + \sum \frac{S(in)+inT(in)}{inG'(in)}\,e^{int}. \tag{15}$$

Taking the terms from $e^{\pm int}$ together we have

$$\frac{S(0)}{G(0)} + \sum \frac{2S(in)}{inG'(in)}\cos nt - \sum \frac{2nT(in)}{inG'(in)}\sin nt. \tag{16}$$

But

$$inG'(in) = -2n^2\left\{\frac{d}{dp^2}G(p^2)\right\}_{p^2=-n^2}, \tag{17}$$

so that

$$\frac{F(p)}{G(p)}\,1 = \frac{S(0)}{G(0)} - \sum \frac{S(in)\cos nt - nT(in)\sin nt}{n^2\{dG/dp^2\}_{p^2=-n^2}}. \tag{18}$$

Since $S$ and $T$ are polynomials in $p$ containing only terms of even degrees this expresses
the interpretation directly in real form in terms of trigonometrical functions.

7·09. The Heaviside unit function. This function $H(t)$ is defined by

$$H(t) = 0 \quad (t<0); \qquad H(t) = 1 \quad (t>0). \tag{1}$$

Evidently

$$p^{-n}H(t) = p^{-n}\,1 \quad (t>0); \qquad p^{-n}H(t) = 0 \quad (t<0). \tag{2}$$

Hence if $F(p)\,1 = f(t)$,

$$F(p)H(t) = f(t) \quad (t>0), \qquad F(p)H(t) = 0 \quad (t<0), \tag{3}$$

and in general

$$F(p)H(t) = f(t)H(t). \tag{4}$$

We are usually interested only in positive values of $t$; and then it is irrelevant
whether $F(p)$ is supposed to operate on 1 or on $H(t)$. Then for either $F(p)\,1$ or $F(p)H(t)$
it is customary to write simply $F(p)$ and leave the fact that $F(p)$ is supposed to operate
on 1 or on $H(t)$ to be understood.


EXAMPLES 

Solve the following differential equations with the initial conditions stated:

1. $\dfrac{d^2x}{dt^2} + 4\dfrac{dx}{dt} + 3x = 1$; $x_0 = 3$, $x_1 = -2$.

2. $\dfrac{d^2x}{dt^2} + 6\dfrac{dx}{dt} + 6x = 12$; $x_0 = 2$, $x_1 = 0$.

3. ^ + Zx = e~ u ; x 0 = 0. 
at 

d*x dx 

4. —+ 4 — + 4x = t 3 e~**; x 0 = 0, z t = 0. 
at* at 

d*v d?v d 2 v dv 

5. ^ + 6 ^+ n T^ + 6 /= 20e " 2 * sin!K » 

dx* 4xr dx* dx 


given that 

6. Solve the equations: 


!=»■ 2 -* 2 - 


given that z = l,y = 0 when t = 0. 
7. If 

prove that 


dx 

— + 5x+2 y = e-*, 
at 

dy 

— + 2x + 2y = 0, 

(tv 


F(P) 1 =/(«) 
^F(p)- F'(p)} 1 = 


(M/c, 1930.) 


(Prelim. 1945.) 







Chapter 8 

PHYSICAL APPLICATIONS OF THE OPERATIONAL METHOD 

Cut the cackle and come to the hosses. 


8·01. Charging of a condenser. An electric circuit contains a cell, a condenser and
a coil with self-induction and resistance. Initially the circuit is open. It is suddenly
completed; find how the charge on the plates varies with the time.

Let $y$ be the charge on the condenser, $t$ the time, $C$ the capacity of the condenser,
$L$ the self-induction, $R$ the resistance of the circuit and $E$ the electromotive force of the
cell. The current is $\dot y$, and the charging of the condenser produces a potential difference
$y/C$ tending to oppose the original e.m.f. Then $y$ satisfies the differential equation

$$E - \frac{y}{C} = L\ddot y + R\dot y. \tag{1}$$

Initially $y$ and $\dot y$, the current, are zero. Hence the subsidiary equation is simply

$$\left(Lp^2 + Rp + \frac{1}{C}\right)y = E, \tag{2}$$

and the operational solution is

$$y = \frac{E}{Lp^2+Rp+1/C} = \frac{E}{L(p+\alpha)(p+\beta)}, \tag{3}$$

say. The interpretation is, by the partial fraction rule,

$$y = \frac{E}{L\alpha\beta} + \frac{Ee^{-\alpha t}}{L(-\alpha)(-\alpha+\beta)} + \frac{Ee^{-\beta t}}{L(-\beta)(-\beta+\alpha)} = CE + \frac{E}{L(\alpha-\beta)}\left(\frac{1}{\alpha}e^{-\alpha t} - \frac{1}{\beta}e^{-\beta t}\right). \tag{4}$$

Since $\alpha+\beta$ and $\alpha\beta$ are both positive, $\alpha$ and $\beta$ must be either both real and positive, or
else conjugate complexes with positive real parts. In either case $y$ tends to a limit $CE$,
as we should expect.

We notice that if the circuit contained no capacity or self-induction the differential
equation would be simply

$$R\dot y = E. \tag{5}$$

Hence if the solution has been found for simple resistances, self-induction and capacity
can be allowed for by writing $Lp + R + 1/Cp$ for $R$. For this reason this expression is
sometimes called a resistance operator, and the operational method generally the method
of resistance operators. The exponential terms in the solution become negligible after a
short time, though they are important in experiments where we need to know how long
it will take to approach a steady state. They are often called the transient.
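As a numerical illustration of the interpretation (4), the sketch below evaluates the charge for given circuit constants, taking $\alpha$ and $\beta$ from the factorization $Lp^2 + Rp + 1/C = L(p+\alpha)(p+\beta)$. The function name and the sample values are illustrative assumptions; complex arithmetic is used so that the oscillatory case is covered, and the critically damped case $\alpha = \beta$ is deliberately left out because (4) then needs a limiting form.

```python
import cmath

def condenser_charge(t, L, R, C, E):
    """Charge y(t) from the partial-fraction interpretation (4).

    alpha and beta satisfy alpha + beta = R/L and alpha * beta = 1/(L C).
    Not valid when alpha == beta (R^2 = 4 L / C).
    """
    disc = cmath.sqrt(R * R - 4.0 * L / C)
    alpha = (R + disc) / (2.0 * L)
    beta = (R - disc) / (2.0 * L)
    y = C * E + E / (L * (alpha - beta)) * (
        cmath.exp(-alpha * t) / alpha - cmath.exp(-beta * t) / beta
    )
    return y.real   # the imaginary parts cancel for real circuit constants

# Oscillatory example: R^2 < 4 L / C
start = condenser_charge(0.0, 1.0, 1.0, 1.0, 2.0)
late = condenser_charge(60.0, 1.0, 1.0, 1.0, 2.0)
# Non-oscillatory example: alpha = 2, beta = 1
late_overdamped = condenser_charge(60.0, 1.0, 3.0, 0.5, 2.0)
print(start, late, late_overdamped)
```

In both examples the charge starts from zero and settles at $CE$, which is 2 in the first case and 1 in the second.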

8·02. Alternating e.m.f. applied to a coil with self-induction. Let $x$ be the
current produced. The e.m.f. is $v\cos nt$, which we can take as the real part of $ve^{int}$. Then
we have to solve

$$L\dot x + Rx = ve^{int} = \frac{vp}{p-in},$$

and if $x$ is initially zero the initial conditions contribute nothing to the subsidiary equation.
Thus the operational solution is

$$x = \frac{vp}{(Lp+R)(p-in)} = \frac{v}{L}\left(\frac{e^{int}}{in+R/L} + \frac{e^{-Rt/L}}{-R/L-in}\right) = \frac{v(R-Lin)}{L^2n^2+R^2}\left(e^{int} - e^{-Rt/L}\right),$$

and the real part of this is

$$x = \frac{v}{L^2n^2+R^2}\left(R\cos nt + Ln\sin nt - Re^{-Rt/L}\right).$$

The first two terms give a harmonic variation, out of phase with the e.m.f. The last term
gives the transient, which becomes negligible after a time of order $L/R$.
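The real solution can be checked against the differential equation $L\dot x + Rx = v\cos nt$ directly. The following sketch differentiates the closed form by a central difference and reports the residual; the function names, step size and circuit values are assumptions chosen for the example.

```python
import math

def coil_current(t, L, R, v, n):
    """Real part of the operational solution: harmonic part plus transient."""
    scale = v / (L * L * n * n + R * R)
    return scale * (
        R * math.cos(n * t) + L * n * math.sin(n * t) - R * math.exp(-R * t / L)
    )

def residual(t, L, R, v, n, h=1e-5):
    """L dx/dt + R x - v cos nt, with dx/dt from a central difference."""
    dxdt = (coil_current(t + h, L, R, v, n) - coil_current(t - h, L, R, v, n)) / (2 * h)
    return L * dxdt + R * coil_current(t, L, R, v, n) - v * math.cos(n * t)

L, R, v, n = 0.5, 2.0, 10.0, 3.0
initial_current = coil_current(0.0, L, R, v, n)
worst = max(abs(residual(0.1 * k, L, R, v, n)) for k in range(1, 50))
print(initial_current, worst)
```

The current starts at zero and the residual stays at the level of the differencing error, as it should if the formula satisfies the equation.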

The harmonic part has amplitude $v/\sqrt{(L^2n^2+R^2)}$, and can be written as the real part of

$$\frac{ve^{int}}{R+Lin}.$$


This is the basis of the so-called ‘vector diagram’, which has nothing to do with 
vectors, but is a special case of the geometrical representation of complex quantities 
usually associated with the name of Argand, though he was anticipated by Wallis and 
Wessel. 


8·03. Discharge of a condenser in one of two mutually influencing circuits.
Suppose that we have two similar circuits, each with self-induction $L$ and containing a
condenser of capacity $C$, but negligible resistance, and that the condenser in one has a
charge $x_0$ initially and the other none. The coefficient of mutual induction is $M$. The first
circuit is closed; find the ensuing variations of the charges.

Put $CL = 1/\alpha^2$, $M = L\beta$; if $x$, $y$ denote the charges on the condensers in the two
circuits

$$L(\ddot x + \beta\ddot y + \alpha^2x) = 0, \tag{1}$$

$$L(\beta\ddot x + \ddot y + \alpha^2y) = 0. \tag{2}$$

Initially

$$x = x_0, \quad \dot x = 0, \quad y = 0, \quad \dot y = 0, \tag{3}$$

and the subsidiary equations are

$$(p^2+\alpha^2)x + \beta p^2y = p^2x_0, \tag{4}$$

$$\beta p^2x + (p^2+\alpha^2)y = \beta p^2x_0. \tag{5}$$

On solving by algebra

$$\frac{x}{p^2(p^2+\alpha^2)-\beta^2p^4} = \frac{y}{\beta p^2(p^2+\alpha^2)-\beta p^4} = \frac{x_0}{(p^2+\alpha^2)^2-\beta^2p^4}, \tag{6}$$

that is,

$$\frac{x}{(1-\beta^2)p^4+p^2\alpha^2} = \frac{y}{\beta\alpha^2p^2} = \frac{x_0}{\{(1+\beta)p^2+\alpha^2\}\{(1-\beta)p^2+\alpha^2\}}. \tag{7}$$

Then

$$x = \frac{p^2\{(1-\beta^2)p^2+\alpha^2\}}{\{(1+\beta)p^2+\alpha^2\}\{(1-\beta)p^2+\alpha^2\}}\,x_0 = \tfrac12\left\{\frac{(1+\beta)p^2}{(1+\beta)p^2+\alpha^2} + \frac{(1-\beta)p^2}{(1-\beta)p^2+\alpha^2}\right\}x_0 = \tfrac12\left\{\cos\frac{\alpha t}{\sqrt{(1+\beta)}} + \cos\frac{\alpha t}{\sqrt{(1-\beta)}}\right\}x_0, \tag{8}$$

$$y = \frac{\beta\alpha^2p^2}{\{(1+\beta)p^2+\alpha^2\}\{(1-\beta)p^2+\alpha^2\}}\,x_0 = \tfrac12\left\{\frac{(1+\beta)p^2}{(1+\beta)p^2+\alpha^2} - \frac{(1-\beta)p^2}{(1-\beta)p^2+\alpha^2}\right\}x_0 = \tfrac12\left\{\cos\frac{\alpha t}{\sqrt{(1+\beta)}} - \cos\frac{\alpha t}{\sqrt{(1-\beta)}}\right\}x_0. \tag{9}$$

If we write

$$\frac{\alpha}{\sqrt{(1-\beta)}} = \gamma+\delta, \qquad \frac{\alpha}{\sqrt{(1+\beta)}} = \gamma-\delta, \tag{10}$$

the solutions take the forms

$$x = x_0\cos\gamma t\cos\delta t, \qquad y = x_0\sin\gamma t\sin\delta t. \tag{11}$$

If $M$ is small, $\delta$ is small, and the disturbance consists of a rapid oscillation in period $2\pi/\gamma$,
with the amplitude varying so that the oscillation is transferred from one circuit to the
other in time $\pi/2\delta$. This is the case of beats due to weak coupling. A similar phenomenon
is well known for two pendulums hanging on the same support, the support being not
quite rigid, so that one pendulum influences the other by displacing the support. The
same phenomenon of the transfer of the vibration from one pendulum to the other occurs
at regular intervals.

If the coupling is strong, so that $\beta$ is nearly 1, the two periods $2\pi\sqrt{(1\pm\beta)}/\alpha$ are very
different, and the variation of the charge consists of a rapid oscillation superposed on a
slow one of equal amplitude. The slow component has the same phase in the two circuits,
the rapid one opposite phases.
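The passage from the sum forms (8), (9) to the product forms (11), and the transfer time $\pi/2\delta$, can be illustrated numerically. In this sketch `alpha` and `beta` are the constants defined by $CL = 1/\alpha^2$, $M = L\beta$; the function name and the weak-coupling value `beta = 0.05` are assumptions made for the example.

```python
import math

def coupled_charges(t, x0, alpha, beta):
    """Return the charges in sum form, in product form, and (gamma, delta)."""
    w_plus = alpha / math.sqrt(1.0 - beta)    # gamma + delta
    w_minus = alpha / math.sqrt(1.0 + beta)   # gamma - delta
    gamma = 0.5 * (w_plus + w_minus)
    delta = 0.5 * (w_plus - w_minus)
    x_sum = 0.5 * x0 * (math.cos(w_minus * t) + math.cos(w_plus * t))
    y_sum = 0.5 * x0 * (math.cos(w_minus * t) - math.cos(w_plus * t))
    x_prod = x0 * math.cos(gamma * t) * math.cos(delta * t)
    y_prod = x0 * math.sin(gamma * t) * math.sin(delta * t)
    return (x_sum, y_sum), (x_prod, y_prod), (gamma, delta)

x0, alpha, beta = 1.0, 1.0, 0.05
mismatch = 0.0
for k in range(200):
    sums, prods, (gamma, delta) = coupled_charges(0.37 * k, x0, alpha, beta)
    mismatch = max(mismatch, abs(sums[0] - prods[0]), abs(sums[1] - prods[1]))

transfer_time = math.pi / (2.0 * delta)
x_at_transfer = coupled_charges(transfer_time, x0, alpha, beta)[1][0]
print(mismatch, transfer_time, x_at_transfer)
```

At the transfer time the factor $\cos\delta t$ vanishes, so the first circuit is momentarily without oscillation while the second carries the full amplitude.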


8·04. Rimington's method of determining self-induction.* In this method the
unknown inductance is placed in the first arm of a Wheatstone bridge; the fourth arm is
shunted, a known capacity being placed in the shunt.

First consider the ordinary Wheatstone bridge, the resistances
of the arms being $R_1$, $R_2$, $R_3$, $R_4$, $G$ that of the galvanometer, $b$ that
of the battery and leads; $x$ is the current in $R_1$, $y$ that in $R_2$, $g$ that
through the galvanometer. Then

$$R_1x - R_2y + Gg = 0, \tag{1}$$

$$R_3x - R_4y - (R_3+R_4+G)g = 0, \tag{2}$$

$$b(x+y) + R_2y + R_4(y+g) = E, \tag{3}$$

and on solving (1) and (2) we find

$$\frac{g}{R_2R_3-R_1R_4} = \frac{x+y}{G(R_1+R_2+R_3+R_4)+(R_1+R_2)(R_3+R_4)}. \tag{4}$$

If $g$ is small compared with $x$ and $y$ we have nearly

(5)

$$g = \frac{(x+y)(R_2R_3-R_1R_4)}{G(R_1+R_2+R_3+R_4)+(R_1+R_2)(R_3+R_4)}. \tag{6}$$

The important feature of the arrangement is that $g = 0$ if $R_2R_3 = R_1R_4$, irrespective of
the accuracy of the approximation (5).

According to our first result we can allow for the self-induction $L$ in the
first arm by replacing $R_1$ by $Lp + R_1$. Let the arrangement in the fourth
arm be as shown. The resistance of the main wire is $R_4$, that of the shunted
portion of it $r$. The shunt has resistance $S$. Then the effective resistance of
the whole arm is

$$R_4 - r + \frac{rS}{r+S} = R_4 - \frac{r^2}{r+S}.$$

If the shunt contains a capacity $C$ we allow for it by replacing $S$ by $S + 1/Cp$. Hence in
the formula (6) for $g$ we must replace $R_1$ by $Lp + R_1$, and $R_4$ by

$$R_4 - \frac{r^2}{r+S+1/Cp} = R_4 - \frac{r^2Cp}{(r+S)Cp+1}. \tag{7}$$

The result expresses the current through the galvanometer when the battery circuit is
suddenly closed.

It can be shown that in actual conditions $g$ cannot vanish for all values of the time.
A sufficient condition for this would be that the modified operator $R_2R_3 - R_1R_4$ should
be identically zero; then $g$ would vanish whatever the remaining factor might represent.
A little consideration will show that this condition is also necessary. This factor is
modified to

$$R_2R_3 - (Lp+R_1)\left\{R_4 - \frac{r^2Cp}{(r+S)Cp+1}\right\}. \tag{8}$$

Multiplying up and equating coefficients of powers of $p$ to zero, we find

$$R_4(r+S) = r^2, \tag{9}$$

$$-LR_4 + (R_2R_3-R_1R_4)(r+S)C + R_1r^2C = 0, \tag{10}$$

$$R_2R_3 - R_1R_4 = 0. \tag{11}$$

From the construction of the apparatus $r \le R_4$, $S \ge 0$. Hence (9) can hold only if $r = R_4$
and $S = 0$; the shunt wire must be attached to the ends of $R_4$ and have zero resistance.
(11) is the usual condition for balance; and substituting in (10) we have

$$L = R_1R_4C. \tag{12}$$

These conditions cannot be completely satisfied. But the changes of current on closing
the battery circuit are so rapid that an ordinary galvanometer will not follow them. If
the current settles down to nothing we have the usual condition for balance; but if there
is a resultant flow through the galvanometer in one direction or the other it will act on the
galvanometer as an impulse, and there will be a ballistic throw. The conditions that $g$
tends to 0 and that there shall be no ballistic throw are

$$g \to 0, \qquad \int_0^\infty g\,dt = 0. \tag{13}$$

But these are satisfied, by 7·07, if the constant term and the term in $p$ in the operational
form for $g$ vanish; and again, irrespective of the approximation (5), we can use (8). The
formal limit when $p \to 0$ is

$$R_2R_3 - R_1R_4 = 0 \tag{14}$$

as before. The coefficient of $p$ is

$$LR_4 - R_1r^2C, \tag{15}$$

and the vanishing of this is the condition for no ballistic throw. Condition (9), which
came from the terms in $p^2$, no longer arises. The method is therefore first to set up the
bridge in balance in the usual way, thus satisfying (14); and then to connect the shunt
containing the capacity to different points in the arm $R_4$ so as to vary $r$. When the
adjustment is such that there is no ballistic throw $r$ is determined, and then (15) gives $L$.

* E. G. Rimington, Phil. Mag. (5) 24, 1887, 54-60; Bromwich, Phil. Mag. (6) 37, 1919, 407-19.
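The expansion of the modified factor in powers of $p$ can be checked mechanically. The sketch below assumes the factor has the form $R_2R_3 - (Lp+R_1)\{R_4 - r^2Cp/((r+S)Cp+1)\}$, obtained by making the two replacements described in the text, and multiplies it by $(r+S)Cp+1$; the function names and all numerical values are hypothetical, chosen so that the bridge is in balance and $L = R_1r^2C/R_4$.

```python
def modified_factor(p, R1, R2, R3, R4, L, C, r, S):
    """R2 R3 - R1 R4 after replacing R1 by L p + R1 and R4 by its shunted form."""
    shunted_arm = R4 - r * r * C * p / ((r + S) * C * p + 1.0)
    return R2 * R3 - (L * p + R1) * shunted_arm

def factor_coefficients(R1, R2, R3, R4, L, C, r, S):
    """Coefficients of 1, p, p^2 after multiplying up by (r + S) C p + 1."""
    c0 = R2 * R3 - R1 * R4
    c1 = -L * R4 + (R2 * R3 - R1 * R4) * (r + S) * C + R1 * r * r * C
    c2 = -L * C * (R4 * (r + S) - r * r)
    return c0, c1, c2

# Illustrative values only: a balanced bridge with the shunt tapped at r < R4.
R1, R2, R3, R4 = 10.0, 20.0, 30.0, 60.0
C, r, S = 1e-6, 30.0, 5.0
L = R1 * r * r * C / R4          # the value the no-throw condition would give

c0, c1, c2 = factor_coefficients(R1, R2, R3, R4, L, C, r, S)
p = 100.0
direct = modified_factor(p, R1, R2, R3, R4, L, C, r, S) * ((r + S) * C * p + 1.0)
print(c0, c1, c2, direct - (c0 + c1 * p + c2 * p * p))
```

With these values the constant term and the coefficient of $p$ both vanish while the coefficient of $p^2$ does not, which matches the remark that condition (9) is not needed for the absence of a ballistic throw.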


8·05. The seismograph. In principle most seismographs are Euler pendulums, that is,
pendulums with supports rigidly attached to the Earth, so that when the ground moves it
displaces the point of support horizontally and disturbs the pendulum. The seismograph
differs from the Euler pendulum as considered in text-books of dynamics in two ways.
Instead of being free to vibrate in a vertical plane, it is constrained to swing, like a gate,
about an axis nearly, but not quite, vertical, so that the period is much lengthened; and
fluid viscosity or electromagnetic damping is introduced to give a frictional term
proportional to the relative velocity. The displacement of the mass with regard to the Earth
then satisfies an equation of the form

$$\ddot x + 2\kappa\dot x + n^2x = \lambda\ddot\xi, \tag{1}$$

where $\xi$ is the displacement of the ground and $\kappa$, $n$, $\lambda$ are constants of the instrument.
Some instruments, such as those of Wiechert and Wood-Anderson, are not on the
principle of the Euler pendulum, but nevertheless give an equation of this form. Others are
arranged to record vertical displacement of the ground; this requires a heavy mass
elastically supported, and is convenient for ground movements of short period, as in
seismic prospecting. For longer periods it is more difficult to design an instrument such
that $x$ will satisfy a linear differential equation, but the difficulties have been overcome in
several different ways, and the differential equation is again of the form (1).

The first object of the instrument is to record as accurately as possible the time of any
sudden change of the velocity of the ground. The second is that when such a change has
been recorded the instrument shall return as quickly as possible to its original position
so as to be ready to record any later disturbances.

Suppose first that the ground suddenly acquires a finite velocity, say unity. Then $\dot\xi$
jumps from 0 to 1, and therefore $\dot x$ from 0 to $\lambda$. The initial conditions are therefore

$$x = 0, \qquad \dot x = \lambda, \tag{2}$$

and our subsidiary equation is

$$(p^2+2\kappa p+n^2)x = \lambda p \quad (t>0). \tag{3}$$

Put

$$p^2+2\kappa p+n^2 = (p+\alpha)(p+\beta). \tag{4}$$

Then

$$x = \frac{\lambda p}{(p+\alpha)(p+\beta)} \tag{5}$$

$$= \frac{\lambda}{\alpha-\beta}\left(e^{-\beta t} - e^{-\alpha t}\right). \tag{6}$$

The recorded displacement $x$ therefore begins by increasing at a finite rate $\lambda$, reaches a
maximum $\dfrac{\lambda}{\alpha}\left(\dfrac{\beta}{\alpha}\right)^{\beta/(\alpha-\beta)}$ after a time $\dfrac{1}{\alpha-\beta}\log\dfrac{\alpha}{\beta}$, and then tends asymptotically to zero.

If $\alpha$ and $\beta$ are real, and $\beta<\alpha$, the behaviour after a long time depends mainly on $e^{-\beta t}$;
to confine the effects of a disturbance to as short an interval of time as possible, we should
therefore make $\beta$ as large as possible. But

$$\beta = \kappa - \sqrt{(\kappa^2-n^2)}, \tag{7}$$

and for given $n$, $\beta$ is greatest (given that it is real and therefore $\kappa \ge n$) when $\kappa = n$. This is
the condition for what is called aperiodicity. The solution then reduces to

$$x = \lambda te^{-nt} \quad (t>0). \tag{8}$$

The maximum displacement is now at time $1/n$ and is equal to $\lambda/en$.

If $\kappa<n$, we can put

$$n^2 - \kappa^2 = \gamma^2. \tag{9}$$

Then (6) becomes

$$x = \frac{\lambda}{\gamma}\,e^{-\kappa t}\sin\gamma t, \tag{10}$$

and the motion dies down more rapidly the larger $\kappa$ is, in the range considered. The
aperiodic state $\kappa = n$ therefore gives the least motion after a long time for given $n$.

In practice, however, $\kappa$ is usually made rather less than $n$. In the Milne-Shaw instrument,
for instance, $\kappa$ is about $0{\cdot}7n$. The motion is then oscillatory, but the ratio of the first
swing to the second is $e^{\pi\kappa/\gamma}$, about 20. But $x$ vanishes after an interval $\pi/\gamma$ from the start,
or about $4/n$, and ever afterwards is a small fraction of its first maximum. The reduced
damping effect after a long time is considered less important than the quick recovery to
zero after the first maximum. The time of the first maximum is $1{\cdot}1/n$ from the start, as
against $1/n$ for the aperiodic instrument and $1{\cdot}57/n$ for the undamped one.
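The figures quoted for the aperiodic and the partially damped instrument can be reproduced from the three forms of the response. The sketch below is a minimal illustration under the assumption $n = \lambda = 1$ and $\kappa = 0{\cdot}7n$ for the Milne-Shaw case; the function name is hypothetical.

```python
import math

def pendulum_response(t, lam, kappa, n):
    """Interpretation of lam p / (p^2 + 2 kappa p + n^2) for t > 0."""
    if kappa > n:                        # two real roots
        root = math.sqrt(kappa * kappa - n * n)
        alpha, beta = kappa + root, kappa - root
        return lam / (alpha - beta) * (math.exp(-beta * t) - math.exp(-alpha * t))
    if kappa == n:                       # aperiodic case
        return lam * t * math.exp(-n * t)
    gamma = math.sqrt(n * n - kappa * kappa)   # oscillatory case
    return lam / gamma * math.exp(-kappa * t) * math.sin(gamma * t)

n, lam = 1.0, 1.0
kappa = 0.7 * n
gamma = math.sqrt(n * n - kappa * kappa)

aperiodic_peak = pendulum_response(1.0 / n, lam, n, n)   # expected lam / (e n)
swing_ratio = math.exp(math.pi * kappa / gamma)          # first swing over second
first_zero = math.pi / gamma                             # in units of 1/n
first_maximum = math.atan2(gamma, kappa) / gamma         # where the derivative vanishes
print(aperiodic_peak, swing_ratio, first_zero, first_maximum)
```

Under these assumptions the swing ratio comes out near 22 and the first zero near $4{\cdot}4/n$, consistent with the rounded values "about 20" and "about $4/n$" in the text.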

The Galitzin seismograph is similarly arranged, but the motion of the pendulum,
instead of being recorded directly, generates by electromagnetic induction a current, which
passes through a galvanometer. If $x$ is the displacement of the pendulum, and $y$ that of
the galvanometer mirror, the differential equations are

$$\ddot x + 2\kappa_1\dot x + n_1^2x = \lambda\ddot\xi, \tag{11}$$

$$\ddot y + 2\kappa_2\dot y + n_2^2y = \mu\dot x, \tag{12}$$

where the reaction of the induced current on the pendulum is neglected. Again supposing
the ground to start with unit velocity, we have

$$y = \frac{\lambda\mu p^2}{(p^2+2\kappa_1p+n_1^2)(p^2+2\kappa_2p+n_2^2)} \quad (t>0). \tag{13}$$

In instruments of the original design $\kappa$ and $n$ were made the same for both the
interacting systems, and both were made aperiodic, so that

$$\kappa_1 = \kappa_2 = n_1 = n_2 = n. \tag{14}$$

Then

$$y = \frac{\lambda\mu p^2}{(p+n)^4} \quad (t>0)$$

$$= \tfrac12\lambda\mu\left(t^2 - \tfrac13nt^3\right)e^{-nt}. \tag{15}$$

The indicator therefore begins to move with a finite acceleration, instead of with a
finite velocity as for the pendulum. The maximum displacement follows after time
$(3-\sqrt3)/n = 1{\cdot}27/n$, the mirror passes through the equilibrium position after time $3/n$,
and there is a maximum displacement in the opposite direction after time $4{\cdot}73/n$. The
mirror then returns asymptotically to the position of equilibrium. The ratio of the two
extreme displacements is $e^{2\sqrt3}/(2+\sqrt3)^2 = 2{\cdot}3$. In comparison with a partially damped
instrument such as the Milne-Shaw, recording directly, the Galitzin machine gives the
first maximum a little later, the first zero a little earlier, and the next extreme displacement
is larger in comparison with the first. It will be seen from the graph that in spite of the
fact that $y$ for small $t$ is proportional to $t^2$ instead of $t$, it begins to increase rapidly so soon
that the beginning of the movement can be very accurately read.
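The times and the ratio quoted for the aperiodic Galitzin instrument follow from the shape $(s^2 - \tfrac13s^3)e^{-s}$ with $s = nt$. The short sketch below evaluates them; the function name is a hypothetical label for that shape, with the constant factor $\tfrac12\lambda\mu/n^2$ omitted.

```python
import math

def mirror_shape(s):
    """(s^2 - s^3 / 3) e^{-s}: proportional to the mirror displacement, s = n t."""
    return (s * s - s**3 / 3.0) * math.exp(-s)

def slope(s, h=1e-6):
    """Central-difference derivative of the shape, used to confirm the extremes."""
    return (mirror_shape(s + h) - mirror_shape(s - h)) / (2.0 * h)

s_first = 3.0 - math.sqrt(3.0)     # first extreme, about 1.27
s_second = 3.0 + math.sqrt(3.0)    # opposite extreme, about 4.73
ratio = mirror_shape(s_first) / abs(mirror_shape(s_second))
print(s_first, s_second, mirror_shape(3.0), ratio)
```

The ratio agrees with $e^{2\sqrt3}/(2+\sqrt3)^2$, and the shape vanishes at $s = 3$, the time at which the mirror passes through equilibrium.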

Later modifications of the Galitzin instrument have been to abandon the relations (14)
by reducing the damping and by making the galvanometer period shorter than the
pendulum period. For harmonic motions of the ground this makes the magnification vary
less with the forced period. It does not seem possible, however, to reduce the overswing
on recovery after an impulsive change in the velocity of the ground. In some modern
designs the reaction of the induced current on the pendulum can no longer be neglected.*

8·06. Resonance. A simple pendulum, originally hanging in equilibrium, is disturbed
for a finite time by a force varying harmonically in a period equal to the free period of the
pendulum. Find the motion after the force is removed.

The differential equation is

$$\ddot x + n^2x = f\sin nt = f\,\frac{np}{p^2+n^2} \quad (0<t<T),$$

with $x = 0$, $\dot x = 0$ at $t = 0$. Then

$$x = \frac{fnp}{(p^2+n^2)^2} = \frac{f}{2n^2}(\sin nt - nt\cos nt).$$

The motion can therefore be regarded as a harmonic motion of continually increasing
amplitude. Suppose that the disturbance acts for a time $T = r\pi/n$, where $r$ is an integer.
At the end of this time

$$x = -\frac{f}{2n^2}\,r\pi(-1)^r, \qquad \dot x = 0.$$

The subsequent motion is therefore given by

$$x = -(-1)^r\,\frac{\pi rf}{2n^2}\cos(nt-r\pi) = -\frac{\pi rf}{2n^2}\cos nt,$$

and is therefore a harmonic motion with amplitude proportional to the duration of the
disturbance.

Linear differential equations in dynamics are usually the result of neglecting the square 
and higher powers of the displacement. What the result shows is that the amplitude, if 
the forced and free periods agree, will grow until the neglected terms need to be taken 
into account. 
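The resonant solution can be checked against the differential equation and against the values stated at the end of the disturbance. This is a sketch under the assumption that the forced motion is $x = (f/2n^2)(\sin nt - nt\cos nt)$; the function names, the difference step and the sample values of `f`, `n` and `r` are illustrative.

```python
import math

def forced_displacement(t, f, n):
    """x = f / (2 n^2) * (sin nt - nt cos nt), the motion while the force acts."""
    return f / (2.0 * n * n) * (math.sin(n * t) - n * t * math.cos(n * t))

def equation_residual(t, f, n, h=1e-4):
    """x'' + n^2 x - f sin nt, with x'' from a central second difference."""
    second = (
        forced_displacement(t + h, f, n)
        - 2.0 * forced_displacement(t, f, n)
        + forced_displacement(t - h, f, n)
    ) / (h * h)
    return second + n * n * forced_displacement(t, f, n) - f * math.sin(n * t)

f, n, r = 1.0, 2.0, 3
T = r * math.pi / n                       # the force acts for r half-periods
x_end = forced_displacement(T, f, n)
v_end = (forced_displacement(T + 1e-6, f, n) - forced_displacement(T - 1e-6, f, n)) / 2e-6
amplitude = math.pi * r * f / (2.0 * n * n)
worst = max(abs(equation_residual(0.05 * k, f, n)) for k in range(1, 90))
print(x_end, v_end, amplitude, worst)
```

At $t = T$ the displacement equals $-(f/2n^2)\,r\pi(-1)^r$ and the velocity vanishes, so the later free motion has amplitude $\pi rf/2n^2$, proportional to $r$.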

* The theory is more fully developed by J. Rybner, Gerlands Beitr. 31, 1931, 259-81; 61, 1937,
375-401; 55, 1939, 303-13.

8·07. Three particles of masses $m$, $\tfrac{21}{20}m$, and $m$, in order, are attached to a light stretched
string of length $4l$, dividing it into equal intervals. One of the particles of mass $m$ is struck by
a transverse impulse $I$. Find the subsequent motion of the middle particle. (Intercollegiate
Examination, 1923.)

If $x_1$, $x_2$, $x_3$ are the displacements of the three particles and $P$ the tension, we find in
the usual way the equations of motion

$$\ddot x_1 = -\lambda(2x_1 - x_2), \qquad \tfrac{21}{20}\ddot x_2 = -\lambda(-x_1 + 2x_2 - x_3), \qquad \ddot x_3 = -\lambda(-x_2 + 2x_3), \tag{1}$$

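where $\lambda = P/ml$. Initially, all the displacements are zero, $\dot x_2 = \dot x_3 = 0$, $\dot x_1 = I/m$. Then
the subsidiary equations are

$$\left.\begin{aligned}
(p^2+2\lambda)x_1 - \lambda x_2 &= pI/m,\\
-\lambda x_1 + \left(\tfrac{21}{20}p^2+2\lambda\right)x_2 - \lambda x_3 &= 0,\\
-\lambda x_2 + (p^2+2\lambda)x_3 &= 0.
\end{aligned}\right\} \tag{2}$$

As we are asked only for the variation of $x_2$ we eliminate $x_1$ and $x_3$. We have

$$x_1 = \frac{\lambda}{p^2+2\lambda}\,x_2 + \frac{p}{p^2+2\lambda}\,\frac{I}{m}, \qquad x_3 = \frac{\lambda}{p^2+2\lambda}\,x_2, \tag{3}$$

and then

$$\left(\tfrac{21}{20}p^2+2\lambda\right)x_2 - \frac{2\lambda^2}{p^2+2\lambda}\,x_2 = \frac{\lambda p}{p^2+2\lambda}\,\frac{I}{m}, \tag{4}$$

and on simplifying

$$(7p^2+4\lambda)(3p^2+10\lambda)x_2 = 20\lambda pI/m. \tag{5}$$

Then

$$x_2 = \frac{20}{58}\,\frac{I}{m}\left(\frac{7p}{7p^2+4\lambda} - \frac{3p}{3p^2+10\lambda}\right) \tag{6}$$

$$= \frac{10}{29}\,\frac{I}{m}\left(\frac{\sin\alpha t}{\alpha} - \frac{\sin\beta t}{\beta}\right), \tag{7}$$

where

$$\alpha^2 = \tfrac47\lambda, \qquad \beta^2 = \tfrac{10}{3}\lambda. \tag{8}$$

If we want also the motions of the other particles they can be found by using (3) and
applying the partial-fraction rule. They will contain terms with the same periods as those
in (7), but also terms with period $2\pi/\sqrt{(2\lambda)}$. These correspond to a third normal mode, in
which the middle particle does not move. This illustrates one great advantage of the
operational method. We are asked only for $x_2$, and the method gives it directly. In the
usual method we should have to determine the amplitudes of all three normal modes
separately, even though one of them is irrelevant to the question asked.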
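The closed form for the middle particle can be compared with a direct numerical integration of the three equations of motion. The sketch below assumes a middle mass of $\tfrac{21}{20}m$, which is the value consistent with the simplified equation $(7p^2+4\lambda)(3p^2+10\lambda)x_2 = 20\lambda pI/m$, and uses a classical fourth-order Runge-Kutta step; the function names and the choice $\lambda = I/m = 1$ are assumptions of the example.

```python
import math

def accelerations(x, lam):
    """Accelerations of the three particles; the middle mass is taken as 21m/20."""
    x1, x2, x3 = x
    return [
        -lam * (2 * x1 - x2),
        -(20.0 / 21.0) * lam * (-x1 + 2 * x2 - x3),
        -lam * (-x2 + 2 * x3),
    ]

def simulate_middle(t_end, lam, impulse_over_m, steps=5000):
    """Integrate from rest with the first particle given velocity I/m; return x2."""
    h = t_end / steps
    x = [0.0, 0.0, 0.0]
    v = [impulse_over_m, 0.0, 0.0]
    for _ in range(steps):
        a1 = accelerations(x, lam)
        v2 = [vi + 0.5 * h * ai for vi, ai in zip(v, a1)]
        a2 = accelerations([xi + 0.5 * h * vi for xi, vi in zip(x, v)], lam)
        v3 = [vi + 0.5 * h * ai for vi, ai in zip(v, a2)]
        a3 = accelerations([xi + 0.5 * h * vi for xi, vi in zip(x, v2)], lam)
        v4 = [vi + h * ai for vi, ai in zip(v, a3)]
        a4 = accelerations([xi + h * vi for xi, vi in zip(x, v3)], lam)
        x = [xi + h / 6.0 * (p + 2 * q + 2 * s + w) for xi, p, q, s, w in zip(x, v, v2, v3, v4)]
        v = [vi + h / 6.0 * (p + 2 * q + 2 * s + w) for vi, p, q, s, w in zip(v, a1, a2, a3, a4)]
    return x[1]

def middle_formula(t, lam, impulse_over_m):
    """x2 = (10/29)(I/m)(sin(alpha t)/alpha - sin(beta t)/beta)."""
    alpha = math.sqrt(4.0 * lam / 7.0)
    beta = math.sqrt(10.0 * lam / 3.0)
    return 10.0 / 29.0 * impulse_over_m * (math.sin(alpha * t) / alpha - math.sin(beta * t) / beta)

numeric = simulate_middle(5.0, 1.0, 1.0)
closed = middle_formula(5.0, 1.0, 1.0)
print(numeric, closed)
```

Only the two frequencies $\alpha$ and $\beta$ appear in the middle displacement; the mode of frequency $\sqrt{(2\lambda)}$ is excited by the blow but leaves the middle particle at rest.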

8·08. Small oscillations in dynamics. Consider a dynamical system with a
Lagrangian function given by

$$2L = a_{rs}\dot x_r\dot x_s - c_{rs}x_rx_s, \tag{1}$$

so that the equations of motion are

$$a_{rs}\ddot x_s + c_{rs}x_s = S_r, \tag{2}$$

$S_r$ being any generalized force component applied to $x_r$ and not taken into account in the
potential energy. If the system starts from rest and only one of the $S_r$ differs from zero
we can write

$$e_{ms} = a_{ms}p^2 + c_{ms}, \tag{3}$$

and the subsidiary equations are

$$e_{ms}x_s = 0 \quad (m \neq r), \qquad e_{ms}x_s = S_r \quad (m = r). \tag{4}$$

Writing $\Delta$ for the determinant of the $e_{ms}$ and $E_{rs}$ for the cofactor of $e_{rs}$ in this determinant,
we have the operational solution

$$x_s = \frac{E_{rs}}{\Delta}\,S_r. \tag{5}$$

Notice that $r$ is a particular suffix and is not summed over. Now the determinant $\Delta$ is
symmetrical, so that $E_{rs} = E_{sr}$. Thus a given force $S_r$ applied to the coordinate $x_r$ will
produce exactly the same variation in $x_s$ as the same force would produce in $x_r$ if it was
applied to $x_s$. Thus we have a reciprocity theorem applicable to all non-gyroscopic and
frictionless systems.

It is easy to see that friction does not affect the result if it is expressible by a dissipation
function $F = \tfrac12b_{rs}\dot x_r\dot x_s$. In particular, the result is true for electrical networks.

Now suppose that the force reduces to an impulse $J_r$ at $t = 0$ and suppose that $\Delta$ has no
repeated factor; we can write

$$\Delta = A\,\Pi(p^2+\alpha^2), \tag{6}$$

and replace $S_r$ by $pJ_r$. Then

$$x_s = \frac{E_{rs}\,pJ_r}{A\,\Pi(p^2+\alpha^2)} = \sum_\alpha \frac{E_{rs}(-\alpha^2)}{\Pi'(-\alpha^2)}\,\frac{pJ_r}{p^2+\alpha^2}, \tag{7}$$

where $E_{rs}(-\alpha^2)$ and $\Pi'(-\alpha^2)$ denote the results of putting $-\alpha^2$ for $p^2$ in $E_{rs}$ and $d\Delta/dp^2$;
and then

$$x_s = \sum_\alpha \frac{E_{rs}(-\alpha^2)}{\alpha\,\Pi'(-\alpha^2)}\,J_r\sin\alpha t. \tag{8}$$

The separate terms have different periods, and the terms of the same period in different
coordinates constitute a normal mode of the system.

An immediate consequence is that if for some $s$ and $\alpha$, say $\alpha_1$, $E_{rs}(-\alpha_1^2) = 0$ for all $r$,
$x_s$ contains no term in $\sin\alpha_1t$ whatever impulses are applied; in other words, if $x_s$ is the
displacement of a particle of the system, that particle is at a node of the mode in question.
But then if we consider an impulse $J_s$ applied to $x_s$ we shall have

$$x_r = \sum_\alpha \frac{E_{sr}(-\alpha^2)}{\alpha\,\Pi'(-\alpha^2)}\,J_s\sin\alpha t, \tag{9}$$

and again, since $E_{sr} = E_{rs}$, the term in $\sin\alpha_1t$ will have zero coefficient in every coordinate.
Hence we have another general reciprocity theorem; no mode can be excited by striking
the system at any node of that mode. It can be shown similarly that if the initial conditions
specify initial values of the coordinates but the velocities are zero, the subsequent values
contain terms with factors $E_{rs}(-\alpha^2)\cos\alpha t$, and the initial displacement at a node of any
mode will not contribute to the terms in that mode in the subsequent motion.

This principle, in a continuous system, provided one of the crucial tests of the existence
of deep-focus earthquakes. Most earthquakes occur at depths not over about 50 km., and
produce, besides the waves that travel right through the earth, two types of surface waves
explained theoretically by Rayleigh and Love. These resemble waves on deep water in
that the displacements die down rapidly with increasing depth and are inappreciable at
depths over about a wave-length, in this case something of the order of 50-100 km. By
the above principle they should not be excited appreciably by disturbances at greater
depths. The late Professor H. H. Turner had inferred from the times of travel of the
bodily waves that a few earthquakes originated at depths up to some 400 km., but the
evidence appeared capable of other interpretations. Examination of the seismograms of
these earthquakes by Stoneley, however, showed that the surface waves were absent, and
this fact was not explicable by any of the other suggestions, but was just what would be
expected from the reciprocity principle if the earthquakes in question originated at great
depths.

8·09. Case of equal roots. In the discussion of the oscillations of dynamical systems
about equilibrium the ordinary method of seeking for solutions of the form $x_s = \lambda_se^{\gamma t}$
meets with a difficulty when the determinantal equation for $\gamma^2$ has equal roots. In the
ordinary way, if we have a set of simultaneous linear differential equations for $n$ variables,
and we eliminate them in succession in favour of one, we get a differential equation for
that one. If we substitute $e^{\gamma t}$ for it we shall get an equation for $\gamma$, and if there is a repeated
root there will be a second solution $te^{\gamma t}$. If this happened in the theory of small oscillations
it would appear that a repeated value of $\gamma$ would lead to terms of the form $t\cos\kappa t$, $t\sin\kappa t$
($\kappa = i\gamma$), and except for special initial conditions a small oscillation would grow indefinitely.
This was never found to happen, and in fact if it could happen it would contradict the
fundamental principle that if the potential energy is a minimum in the position of
equilibrium, and the initial displacements and velocities are sufficiently small but not zero,
there is a limit that no displacement can ever exceed. Laplace was puzzled, and the
explanation was finally given by Routh* and Heaviside.† If the system is not dissipative
and the roots are unequal, we know from 4·082 and 4·09 that the zeros of the minor of any
element in the leading diagonal separate those of the original determinant, and if the
determinant $\Delta$ has a factor $(p^2+\alpha^2)^k$, every first minor contains the factor $(p^2+\alpha^2)^{k-1}$.
Hence when we evaluate the contribution from the initial conditions to the operational
solution, namely, from 7·061 (7),

$$x_m = \frac{E_{mr}}{\Delta}\,a_{rs}(p^2u_s + pv_s),$$

a factor $(p^2+\alpha^2)^{k-1}$ will cancel and we are left with only a single factor $(p^2+\alpha^2)$ in the
denominator. The same will happen for every repeated root, and the interpretation will
contain only terms of the forms $\cos\alpha t$ and $\sin\alpha t$. Varying $u_s$ and $v_s$ will alter the ratios of
the coefficients of these trigonometric factors for different coordinates; it will not
introduce terms like $t\cos\alpha t$ or $t\sin\alpha t$.

* Stability of a given State of Motion, 1877.

† Electrical Papers, 1, 529.

8·10. Dissipative and gyroscopic systems. Here the root separation theorem may
not hold. Then the operational solution may have a repeated factor in the denominator
and terms like $te^{-\alpha t}$, $t\cos\alpha t$, $t\sin\alpha t$ may occur in the interpretation. We have had a simple
instance of this for a dissipative system in the aperiodic seismograph. This will not affect
stability if the undamped system is stable and non-gyroscopic, since the solutions are
exponentially decreasing and will still tend to 0 with increasing $t$. But if a system is kept
stable only by gyroscopic action, coincidence of the roots may ruin the stability. Suppose
that the equations satisfied by two coordinates $x_1$, $x_2$ are

$$\ddot x_1 - b\dot x_2 + c_1x_1 = 0, \qquad \ddot x_2 + b\dot x_1 + c_2x_2 = 0. \tag{1}$$

Assume

$$x_1 = \lambda_1e^{\gamma t}, \qquad x_2 = \lambda_2e^{\gamma t}. \tag{2}$$

We find that $\gamma$ must satisfy the determinantal equation

$$\begin{vmatrix}\gamma^2+c_1 & -b\gamma\\ b\gamma & \gamma^2+c_2\end{vmatrix} = 0, \tag{3}$$

that is,

$$\gamma^4 + (c_1+c_2+b^2)\gamma^2 + c_1c_2 = 0. \tag{4}$$

A necessary condition for stability is that both values of $\gamma^2$ shall be real and $< 0$. Hence
$c_1c_2 > 0$, and we have two cases according as $c_1$ and $c_2$ are both positive or both negative.

Take first the case where they are both positive. Then the system would be stable even
if $b$ was 0. The condition for equal roots is

$$(c_1+c_2+b^2)^2 = 4c_1c_2, \tag{5}$$

that is,

$$(c_1-c_2)^2 + 2b^2(c_1+c_2) + b^4 = 0. \tag{6}$$

With $c_1$, $c_2 > 0$ this can be satisfied only if $b = 0$, and then $c_1 = c_2$. Hence if (3) has equal
roots and $\gamma^2$ is equal to one of them, all elements of the determinant vanish. This is what
we should expect, since $c_1x_1^2 + c_2x_2^2$ is a positive form in this case, and the root separation
theorem still holds in a gyroscopic system when the terms $c_{rs}x_rx_s$ are a positive form.

If, however, $c_1$ and $c_2$ are both negative (6) can be satisfied, provided that

$$b^2 = -(c_1+c_2) \pm \sqrt{\{(c_1+c_2)^2 - (c_1-c_2)^2\}} = -(c_1+c_2) \pm 2\sqrt{(c_1c_2)} = \{\sqrt{(-c_1)} \pm \sqrt{(-c_2)}\}^2. \tag{7}$$

Thus the determinantal equation can have equal roots in this case without $b$ being zero.
It reduces now to

$$\gamma^4 \pm 2\sqrt{(c_1c_2)}\,\gamma^2 + c_1c_2 = 0$$

and

$$\gamma^2 = \mp\sqrt{(c_1c_2)}. \tag{8}$$

In this case the separate elements of the determinant (3) do not vanish, though $\gamma^2$ is still
real and negative. With the lower sign in (7) and (8), $\gamma^2$ would be positive and the
system obviously unstable. We therefore take the negative sign in (8) and the positive
sign in (7). To see what will happen to a system satisfying these conditions, with
$x_1 = u_1$, $\dot x_1 = 0$, $x_2 = 0$, $\dot x_2 = 0$ at $t = 0$, we write

$$c_1 = -\alpha^2, \qquad c_2 = -\beta^2, \qquad b = \alpha+\beta, \tag{9}$$

and the subsidiary equations are

$$\left.\begin{aligned}
(p^2-\alpha^2)x_1 - (\alpha+\beta)px_2 &= p^2u_1,\\
(\alpha+\beta)px_1 + (p^2-\beta^2)x_2 &= (\alpha+\beta)pu_1.
\end{aligned}\right\} \tag{10}$$

The operational solution is

$$x_1 = \frac{p^4+(\alpha^2+2\alpha\beta)p^2}{(p^2+\alpha\beta)^2}\,u_1, \qquad x_2 = -\frac{(\alpha+\beta)\alpha^2p}{(p^2+\alpha\beta)^2}\,u_1, \tag{11}$$

and

$$x_1 = u_1\left\{\cos\sqrt{(\alpha\beta)}\,t + \frac{\alpha(\alpha+\beta)}{2\sqrt{(\alpha\beta)}}\,t\sin\sqrt{(\alpha\beta)}\,t\right\}, \tag{12}$$

$$x_2 = -\frac{(\alpha+\beta)\alpha^2u_1}{2(\alpha\beta)^{3/2}}\left\{\sin\sqrt{(\alpha\beta)}\,t - \sqrt{(\alpha\beta)}\,t\cos\sqrt{(\alpha\beta)}\,t\right\}. \tag{13}$$

Hence if a system is kept stable only by the gyroscopic terms, and the coefficients of
these are such as to make the periods equal, the stability may be ruined in the sense that
the amplitude of a disturbance will increase linearly with the time. This corresponds to the
top with $C^2n^2 = 4AMgh$.
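The linearly growing solution can be compared with a direct integration of the equations of motion in the critical case $c_1 = -\alpha^2$, $c_2 = -\beta^2$, $b = \alpha+\beta$. The sketch below assumes the closed forms $x_1 = u_1\{\cos\nu t + \alpha(\alpha+\beta)\,t\sin\nu t/2\nu\}$ and $x_2 = -(\alpha+\beta)\alpha^2u_1(\sin\nu t - \nu t\cos\nu t)/2\nu^3$ with $\nu = \sqrt{(\alpha\beta)}$, which follow from the operational solution by the rules of 7·062; the function names, step count and the values $\alpha = 1$, $\beta = 2$ are illustrative assumptions.

```python
import math

def derivative(state, a, b):
    """First-order form of x1'' - (a+b) x2' - a^2 x1 = 0, x2'' + (a+b) x1' - b^2 x2 = 0."""
    x1, x2, v1, v2 = state
    return [v1, v2, (a + b) * v2 + a * a * x1, -(a + b) * v1 + b * b * x2]

def rk4_step(state, h, a, b):
    """One classical fourth-order Runge-Kutta step."""
    k1 = derivative(state, a, b)
    k2 = derivative([s + 0.5 * h * k for s, k in zip(state, k1)], a, b)
    k3 = derivative([s + 0.5 * h * k for s, k in zip(state, k2)], a, b)
    k4 = derivative([s + h * k for s, k in zip(state, k3)], a, b)
    return [s + h / 6.0 * (p + 2 * q + 2 * r + w) for s, p, q, r, w in zip(state, k1, k2, k3, k4)]

def closed_form(t, u1, a, b):
    """Assumed closed forms for x1 and x2 in the equal-period case."""
    nu = math.sqrt(a * b)
    x1 = u1 * (math.cos(nu * t) + a * (a + b) / (2.0 * nu) * t * math.sin(nu * t))
    x2 = -u1 * (a + b) * a * a / (2.0 * nu**3) * (math.sin(nu * t) - nu * t * math.cos(nu * t))
    return x1, x2

a, b, u1 = 1.0, 2.0, 1.0
t_end, steps = 10.0, 10000
state = [u1, 0.0, 0.0, 0.0]          # x1 = u1, all else zero at t = 0
for _ in range(steps):
    state = rk4_step(state, t_end / steps, a, b)

x1_exact, x2_exact = closed_form(t_end, u1, a, b)
print(state[0], x1_exact, state[1], x2_exact)
```

Both coordinates carry a term with a factor $t$, so the envelope of the oscillation grows in proportion to the time, in contrast with the non-gyroscopic equal-root case of 8·09.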

8·101. Gyroscopic system with slight friction. Here we need not consider a
general initial disturbance, but confine ourselves to the period equation. If the equations
of motion are

$$\ddot x_1 + f\dot x_1 + c_1x_1 - b\dot x_2 = 0, \qquad \ddot x_2 + f\dot x_2 + c_2x_2 + b\dot x_1 = 0, \tag{1}$$

where $c_1 < 0$, $c_2 < 0$, $f > 0$, and $b$ is large enough to ensure stability when $f$ is put equal to 0,
we assume solutions proportional to $e^{\gamma t}$ and find that $\gamma$ must satisfy

$$\gamma^4 + 2f\gamma^3 + \gamma^2(c_1+c_2+b^2+f^2) + f\gamma(c_1+c_2) + c_1c_2 = 0. \tag{2}$$

Hence

$$\sum\gamma = -2f < 0, \qquad \sum\frac{1}{\gamma} = -\frac{f(c_1+c_2)}{c_1c_2} > 0. \tag{3}$$

Let the roots for $f = 0$ be $\pm in_1$, $\pm in_2$, with $n_1 > n_2$; and for $f$ small and $> 0$ let them be
$\pm in_1 - \alpha_1$, $\pm in_2 - \alpha_2$ to order $f$. Then we have to this order

$$\alpha_1 + \alpha_2 = f > 0,$$

$$\frac{1}{in_1-\alpha_1} + \frac{1}{-in_1-\alpha_1} + \frac{1}{in_2-\alpha_2} + \frac{1}{-in_2-\alpha_2} = -\frac{2\alpha_1}{n_1^2} - \frac{2\alpha_2}{n_2^2} > 0.$$

These inequalities are consistent only if $\alpha_1$ and $\alpha_2$ have opposite signs, and in fact, since
$n_1 > n_2$, $\alpha_1 > 0$, $\alpha_2 < 0$. Hence if a system is kept stable by gyroscopic action only, the effect
of small friction is always to produce instability. The quicker free vibration will be 
damped, but the slower will increase in amplitude with time. 
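A numerical illustration may make the sign rule concrete. The sketch below is not part of the argument: it assumes the period equation in the form $\gamma^4 + 2f\gamma^3 + \gamma^2(c_1+c_2+b^2+f^2) + f\gamma(c_1+c_2) + c_1c_2 = 0$, starts Newton's method from the frictionless roots $in_1$, $in_2$, and uses a function name and sample constants of our own choosing.

```python
import math

def roots_with_friction(c1, c2, b, f, iterations=50):
    """Return (fast, slow): the roots near +i*n1 and +i*n2 for small friction f."""
    s, q = c1 + c2 + b * b, c1 * c2
    disc = math.sqrt(s * s - 4 * q)      # real when the frictionless system is stable
    n_fast = math.sqrt((s + disc) / 2)   # n1, the quicker free vibration
    n_slow = math.sqrt((s - disc) / 2)   # n2, the slower one

    def poly(g):
        return g**4 + 2 * f * g**3 + (s + f * f) * g**2 + f * (c1 + c2) * g + q

    def dpoly(g):
        return 4 * g**3 + 6 * f * g**2 + 2 * (s + f * f) * g + f * (c1 + c2)

    found = []
    for n in (n_fast, n_slow):
        g = complex(0.0, n)              # frictionless root as starting value
        for _ in range(iterations):      # Newton's method in complex arithmetic
            g -= poly(g) / dpoly(g)
        found.append(g)
    return found[0], found[1]

fast, slow = roots_with_friction(c1=-1.0, c2=-4.0, b=4.0, f=0.01)
```

With these sample values the real part of `fast` should come out negative and that of `slow` positive, their sum being $-f$, in agreement with $\alpha_1 + \alpha_2 = f$.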

This feature of gyroscopic motion has considerable theoretical and practical importance. 
If we use the usual method to treat small oscillations about steady motion, neglecting 
friction, we often find that all the roots y are purely imaginary, and infer that the system 
is stable. If the expression $c_{rs}x_rx_s$ is essentially $\ge 0$, and there is a little friction,
$a_{rs}\dot x_r\dot x_s + c_{rs}x_rx_s$ will decrease, and the oscillations will be gradually damped down. Such
systems are called secularly stable. But if the quadratic form in question is not essentially
$\ge 0$ and the system is kept stable only by the gyroscopic terms, the slower oscillation about
steady motion will gradually increase in amplitude until it can no longer be treated as 
small, and may lead to a complete change in the character of the motion. Such systems 
are called ordinarily stable but secularly unstable. The engineer tries to avoid them. 
They have possibly had considerable importance in the development of stellar systems, 
and of the solar system in particular. 


8·11. Radioactive disintegration. The uranium family of elements are such that
an atom of any of them, except the last, is capable of breaking up into an atom of the next
and either an atom of helium (α-particle) or a free electron (β-particle). The emitted
particle leaves the atom and has no effect on the later stages. The number of atoms of
any element that break up in a short interval of time is proportional to the time interval
and to the number of atoms of that element present.* If $u, x_1, x_2, \dots, x_n$ are the expectations
of the numbers of atoms of the various elements present at time $t$, they will satisfy the
differential equations

$$\left.\begin{aligned}\frac{du}{dt} &= -\kappa u,\\ \frac{dx_1}{dt} &= \kappa u - \kappa_1x_1,\\ \frac{dx_2}{dt} &= \kappa_1x_1 - \kappa_2x_2,\\ &\;\;\vdots\\ \frac{dx_n}{dt} &= \kappa_{n-1}x_{n-1}.\end{aligned}\right\} \tag{1}$$


* Strictly speaking, since radioactivity is a random process, this rule is true of the expectation 
of the number of atoms breaking up. The actual number will deviate somewhat from expectation, 
but we can neglect the difference if the expectation is large. 
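Suppose that at $t = 0$ only uranium is present; then $u = u_0$, and all the other dependent
variables are zero. The subsidiary equations are

$$\left.\begin{aligned}(p+\kappa)u &= pu_0,\\ (p+\kappa_1)x_1 &= \kappa u,\\ (p+\kappa_2)x_2 &= \kappa_1x_1,\\ &\;\;\vdots\\ (p+\kappa_{n-1})x_{n-1} &= \kappa_{n-2}x_{n-2},\\ p\,x_n &= \kappa_{n-1}x_{n-1},\end{aligned}\right\} \tag{2}$$

and the operational solutions are written down immediately:

$$u = \frac{pu_0}{p+\kappa}, \quad x_1 = \frac{\kappa pu_0}{(p+\kappa)(p+\kappa_1)}, \quad x_2 = \frac{\kappa\kappa_1pu_0}{(p+\kappa)(p+\kappa_1)(p+\kappa_2)}, \quad x_n = \frac{\kappa\kappa_1\dots\kappa_{n-1}u_0}{(p+\kappa)(p+\kappa_1)\dots(p+\kappa_{n-1})}. \tag{3}$$

These are directly adapted for interpretation by the partial-fraction rule; in fact

$$\begin{aligned}u &= u_0e^{-\kappa t}, \qquad x_1 = \frac{\kappa u_0}{\kappa_1-\kappa}\left(e^{-\kappa t} - e^{-\kappa_1t}\right),\\ x_2 &= \kappa\kappa_1u_0\left\{\frac{e^{-\kappa t}}{(\kappa_1-\kappa)(\kappa_2-\kappa)} + \frac{e^{-\kappa_1t}}{(\kappa-\kappa_1)(\kappa_2-\kappa_1)} + \frac{e^{-\kappa_2t}}{(\kappa-\kappa_2)(\kappa_1-\kappa_2)}\right\},\end{aligned} \tag{4}$$

$$x_n = u_0 - \frac{\kappa_1\dots\kappa_{n-1}u_0}{(\kappa_1-\kappa)\dots(\kappa_{n-1}-\kappa)}\,e^{-\kappa t} - \dots. \tag{5}$$

Of all the decay constants $\kappa$ is much the smallest. If the time elapsed is long enough for
all the exponential factors except $e^{-\kappa t}$ to have become insignificant, the results reduce
approximately to

$$u = u_0e^{-\kappa t}, \qquad x_1 = \frac{\kappa}{\kappa_1}u_0e^{-\kappa t}, \qquad x_2 = \frac{\kappa}{\kappa_2}u_0e^{-\kappa t}, \tag{6}$$

$$x_n = u_0\left(1 - e^{-\kappa t}\right). \tag{7}$$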


With the exception of the last, the quantities of the various elements decrease, retaining
constant ratios to one another in the inverse ratios of their decay constants.
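The partial-fraction results lend themselves to direct computation. The following is a minimal sketch, assuming all decay constants are distinct; the function name `chain_amounts` and the sample constants are ours and are not taken from any measured series.

```python
import math

def chain_amounts(kappas, u0, t):
    """Expected amounts [u, x1, x2, ...] of the decaying members at time t.

    kappas[0] is the parent's decay constant; every constant must be distinct.
    Member j is kappa_0...kappa_{j-1} * u0 times a sum of exponentials, each
    divided by the product of the differences of the other constants from its own.
    """
    amounts = []
    for j in range(len(kappas)):
        prefactor = u0 * math.prod(kappas[:j])
        total = 0.0
        for i in range(j + 1):
            denom = math.prod(kappas[m] - kappas[i] for m in range(j + 1) if m != i)
            total += math.exp(-kappas[i] * t) / denom
        amounts.append(prefactor * total)
    return amounts

# Illustrative constants only: a slowly decaying parent and two quick products.
u, x1, x2 = chain_amounts([0.001, 1.0, 2.5], u0=1000.0, t=40.0)
```

Once the quick exponentials have died away, `x1 / u` and `x2 / u` should be close to $\kappa/\kappa_1$ and $\kappa/\kappa_2$, which is the statement about constant ratios made above.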

On the other hand, if the time elapsed is so short that unity is still a first approximation 
to all the exponential functions, we can proceed by expanding the operators in descending 
powers of $p$ and interpreting term by term. Hence at first $x_1$ will increase in proportion
to $t$, $x_2$ to $t^2$, and $x_n$ to $t^n$.

In experimental work an intermediate condition often occurs. Some of the exponentials 
may become insignificant in the time taken by the experiment, while others are still 
nearly unity. We have 

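$$x_r = \frac{\kappa_{r-1}}{p+\kappa_r}\,x_{r-1} = \kappa_{r-1}\left(p^{-1} - \kappa_rp^{-2} + \dots\right)x_{r-1}, \tag{8}$$

and if $\kappa_rt$ is small we can neglect the second and later terms in comparison with the first.
Hence in this case

$$x_r \approx \kappa_{r-1}p^{-1}x_{r-1}. \tag{9}$$
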
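If $x_{r-1}$ is of the form $At^s = s!\,A\,p^{-s}$ we have from (8)

$$x_r = \frac{\kappa_{r-1}\,s!\,A}{(p+\kappa_r)p^s} = \kappa_{r-1}\,s!\,A\,\frac{1}{p^{s+1}}\,\frac{p}{p+\kappa_r} = \kappa_{r-1}\,s!\,A\,p^{-(s+1)}e^{-\kappa_rt}. \tag{10}$$

If $\kappa_rt$ is small we can replace the exponential by 1 and confirm (9). But if it is great

$$p^{-1}e^{-\kappa_rt} = \int_0^t e^{-\kappa_rt}\,dt = \frac{1}{\kappa_r} - O(e^{-\kappa_rt}), \tag{11}$$

and on continuing the integrations

$$p^{-(s+1)}e^{-\kappa_rt} \approx \frac{1}{\kappa_r}\,p^{-s} = \frac{1}{\kappa_r}\,\frac{t^s}{s!}. \tag{12}$$

Hence

$$x_r \approx \frac{\kappa_{r-1}}{\kappa_r}\,At^s = \frac{\kappa_{r-1}}{\kappa_r}\,x_{r-1}. \tag{13}$$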


Classifying elements into long-lived and short-lived according as $\kappa_rt$ is small or large for
them, $t$ being the duration of the experiment in question, we find that the quantity of the
first long-lived disintegration product increases in proportion to $t$, the second to $t^2$, and
so on. Short-lived products vary nearly in proportion to the previous long-lived one. All
β-ray products are short-lived when $t$ has ordinary values.

Radium is the third α-ray disintegration product of uranium. In rock specimens
the time elapsed since formation is usually such that the relations (6) have become
established. As a matter of observation the numbers of atoms of radium and uranium are
found to be in the constant ratio $3.58 \times 10^{-7}$. This determines $\kappa/\kappa_3$. Also the rate of
break-up of radium is known directly; in fact

$$1/\kappa_3 = 2280 \text{ years}.$$

Hence

$$1/\kappa = 6.37 \times 10^{9} \text{ years}.$$

This gives the rate of disintegration of uranium itself.*

A number of specimens of uranium compounds were carefully freed from radium by 
Soddy, and then kept for ten years. It was found that new radium was formed, increasing 
like the square of the time. This would suggest that of the two elements between uranium 
and radium in the series one was long-lived (in comparison with ten years) and the other 
short-lived. Actually, however, it is known independently that both are long-lived. The 
first, however, is chemically inseparable from ordinary uranium, and therefore was 
present in the original specimens; initially, instead of $x_1 = 0$, we have

$$x_1 = \frac{\kappa}{\kappa_1}\,u_0.$$

For the next element, ionium, we have

$$x_2 = \kappa_1p^{-1}x_1 = \kappa u_0t,$$

and for radium

$$x_3 = \kappa_2p^{-1}x_2 = \tfrac12\kappa\kappa_2u_0t^2.$$


* The numerical data used here have been revised in later experimental determinations, but it 
has also been found that the series branch and reunite to some extent, so that to take the more 
recent results into account would complicate the analysis without introducing any new principle. 
Soddy* found that 3 kg. of uranium in 10.15 years gave $202 \times 10^{-12}$ g. of radium. Hence,
allowing for the difference of atomic weights,

$$x_3/u_0 = 7.1 \times 10^{-14},$$

and, $\kappa$ being known,

$$\kappa_2 = 8.64 \times 10^{-6}/\text{year}, \qquad 1/\kappa_2 = 1.16 \times 10^{5} \text{ years}.$$

This gives the rate of degeneration of ionium. Soddy gets a slightly lower value of $1/\kappa_2$
from more numerous data. 
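The arithmetic of the last two estimates can be retraced as follows. This is a rough check only: the atomic weights 238 and 226 are our assumption (the text says merely that the difference of atomic weights is allowed for), the variable names are ours, and the last figure reproduces the quoted $\kappa_2$ only to within a few per cent.

```python
# Uranium: kappa/kappa_3 is the observed radium/uranium ratio of atom numbers.
ratio_ra_to_u = 3.58e-7
mean_life_radium = 2280.0                             # 1/kappa_3, years
mean_life_uranium = mean_life_radium / ratio_ra_to_u  # 1/kappa, years

# Ionium: x3 = (1/2) * kappa * kappa_2 * u0 * t^2, solved for kappa_2.
mass_radium, mass_uranium = 202e-12, 3000.0           # grams
weight_uranium, weight_radium = 238.0, 226.0          # assumed atomic weights
x3_over_u0 = (mass_radium / mass_uranium) * (weight_uranium / weight_radium)
t = 10.15                                             # years
kappa_2 = 2.0 * x3_over_u0 * mean_life_uranium / t**2
mean_life_ionium = 1.0 / kappa_2
```

The first quotient should agree with the $6.37 \times 10^{9}$ years quoted for uranium, and `mean_life_ionium` should come out of order $10^{5}$ years.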

In another case that sometimes occurs in experiment the specimen has been found in 
nature with the various products in approximately the ratios indicated by (6), but the 
uranium and possibly some later products are then removed chemically, and the behaviour 
of the remainder is studied. A solution for this case has been given by W. F. Sedgwick.†
The operational treatment, the possibility of which was suggested by A. F. Crossley, is
as follows. Let $x_s$ at $t = 0$ be 0 for $s < r$, and $u_s$ for $s \ge r$, where $u_s = \dfrac{\kappa_r}{\kappa_s}u_r$ except for $s = n$.
The subsidiary equations are now

$$(p+\kappa_r)x_r = pu_r, \tag{14}$$

$$(p+\kappa_s)x_s = pu_s + \kappa_{s-1}x_{s-1} \quad (r < s < n), \tag{15}$$

$$p\,x_n = pu_n + \kappa_{n-1}x_{n-1}. \tag{16}$$

We have for $r < s < n$

$$\begin{aligned}(p+\kappa_s)x_s &= (p+\kappa_s)u_s - \kappa_su_s + \kappa_{s-1}x_{s-1}\\ &= (p+\kappa_s)u_s - \kappa_{s-1}(u_{s-1} - x_{s-1}),\end{aligned} \tag{17}$$

that is, if

$$u_s - x_s = y_s, \tag{18}$$

$$(p+\kappa_s)y_s = \kappa_{s-1}y_{s-1}, \tag{19}$$

with

$$(p+\kappa_r)y_r = \kappa_ru_r, \qquad p\,y_n = -\kappa_{n-1}u_{n-1} + \kappa_{n-1}y_{n-1}. \tag{20}$$

The operational solutions are therefore

$$y_r = \frac{\kappa_ru_r}{p+\kappa_r}, \quad y_{r+1} = \frac{\kappa_r^2u_r}{(p+\kappa_r)(p+\kappa_{r+1})}, \quad y_{r+2} = \frac{\kappa_r^2\kappa_{r+1}u_r}{(p+\kappa_r)(p+\kappa_{r+1})(p+\kappa_{r+2})}, \tag{21}$$

$$y_n = -\frac{1}{p}\,\kappa_ru_r + \frac{\kappa_r^2\kappa_{r+1}\dots\kappa_{n-1}u_r}{p(p+\kappa_r)\dots(p+\kappa_{n-1})}. \tag{22}$$

Therefore

$$y_r = u_r\left(1 - e^{-\kappa_rt}\right), \qquad x_r = u_re^{-\kappa_rt},$$

and since $\kappa_ru_r = \kappa_su_s$ each $y_s$ except $y_n$ tends to $u_s$ when $t$ tends to infinity, by the partial
fraction rule. Hence each $x_s$ except $x_n$ tends to 0, as would be expected;

$$x_{r+1} = \frac{u_{r+1}}{\kappa_{r+1}-\kappa_r}\left(\kappa_{r+1}e^{-\kappa_rt} - \kappa_re^{-\kappa_{r+1}t}\right), \tag{23}$$

and so on.

* Phil. Mag. (6) 38, 1919, 483-88.

† Proc. Camb. Phil. Soc. 38, 1942, 283.

The interesting case occurs when the duration of the experiment is such that some of
the earlier exponentials have time to become small. The short-lived earlier products then
disappear during the experiment. If the first not to become small is $e^{-\kappa_mt}$ we have
approximately

$$x_m = \frac{\kappa_r\kappa_{r+1}\dots\kappa_{m-1}}{(\kappa_r-\kappa_m)(\kappa_{r+1}-\kappa_m)\dots(\kappa_{m-1}-\kappa_m)}\,u_me^{-\kappa_mt}, \tag{24}$$

so that this element decays nearly as if the others were not present; and the decay of later
elements in the series will follow the same law so long as there is no intervening element
with $\kappa_s \le \kappa_m$. For those with longer lives, however, $x_s$ will contain a term in $e^{-\kappa_st}$, which
may be larger.


EXAMPLES* 

1. A light string of length $3l$ is stretched under tension $P$ between two fixed points. Masses $5m$
and $8m$ are attached at the points of trisection. A small transverse velocity $u$ is given to the particle
of mass $5m$. Prove that the displacement of the other particle is

$$\frac{5u}{14a}\left(\sqrt{\frac{20}{3}}\,\sin\sqrt{\frac{3}{20}}\,at - \sqrt2\,\sin\frac{at}{\sqrt2}\right),$$

where $a^2 = P/ml$.

2. A Galitzin seismograph is so adjusted that

= K t = k, n? = «£ = 2/c*. 

Prove that the response to a unit impulsive change of velocity is 

A/t 


(M.T. 1929.) 


2k* 


(Kt sin Kt — sin Kt + Kt cos Kt) e~ Kt . 


3. Prove that if $x$ is the displacement on the record given by a Galitzin seismograph, due to
an impulsive change of velocity of the ground,

$$\int_0^\infty x\,dt = 0$$


whatever the constants of the instrument may be. 


* Numerous examples are given by G. W. Carter, The Simple Calculation of Electrical Transients, 
1944. 
NUMERICAL METHODS


I have no satisfaction in formulas unless I feel their numerical magnitude.

LORD KELVIN, Life by Sylvanus Thompson, p. 827

9·01. Approximation by polynomials. The characteristic feature of most numerical
methods is that values of a function $f(x)$ are given for a set of distinct values of $x$,
but not for intermediate values; for purposes of computation these are filled in on
the hypothesis that $f(x)$ can be replaced by a polynomial agreeing with $f(x)$ at the places
where its values are given. The simplest case is that of linear interpolation, in which only
two adjacent values of $f(x)$ are taken from a table and intermediate values are calculated
on the supposition that $f'(x)$ is constant in the interval. This procedure is accurate provided
that $f'(x)$ changes little in the interval, but cases often arise that require allowance for
higher derivatives. The use of a polynomial for fitting can never be mathematically exact
unless $f(x)$ is itself a polynomial, but it can, in suitable circumstances, be as accurate as
the tabulated values themselves.


9·011. Lagrange's interpolation formula.* Let $f(x)$ be given for $x = x_1, x_2, \dots, x_n$.
Then the function

$$\begin{aligned}g(x) ={}& f(x_1)\,\frac{(x-x_2)(x-x_3)\dots(x-x_n)}{(x_1-x_2)(x_1-x_3)\dots(x_1-x_n)} + f(x_2)\,\frac{(x-x_1)(x-x_3)\dots(x-x_n)}{(x_2-x_1)(x_2-x_3)\dots(x_2-x_n)}\\ &+ \dots + f(x_n)\,\frac{(x-x_1)\dots(x-x_{n-1})}{(x_n-x_1)\dots(x_n-x_{n-1})}\end{aligned} \tag{1}$$

is equal to $f(x_1)$ for $x = x_1$, to $f(x_2)$ for $x = x_2$, and so on. Also it is a polynomial of degree
$n-1$. It is symmetrical in the sense that it is unaltered by any interchange of the suffixes;
the tabulated values can therefore be taken in any order. 
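Formula (1) translates almost word for word into a short routine. The sketch below is for illustration; the function name and the sample cubic are ours.

```python
def lagrange(xs, fs, x):
    """Value at x of the polynomial of degree n-1 agreeing with fs at the distinct xs."""
    total = 0.0
    for i, (xi, fi) in enumerate(zip(xs, fs)):
        term = fi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)   # one factor of the i-th product in (1)
        total += term
    return total

# Sample data from the cubic f(x) = x^3 - 2x + 1 at four unequally spaced points.
xs = [0.0, 1.0, 3.0, 4.0]
fs = [x**3 - 2 * x + 1 for x in xs]
value = lagrange(xs, fs, 2.0)
```

Since four points determine a cubic, `value` should reproduce $f(2) = 5$ apart from rounding, and shuffling the data leaves it unchanged, which is the symmetry just noted.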

Most interpolation formulae can be derived from this. It is not usually convenient,
because in practice $g(x)$ will usually be determined mainly by the adjacent
tabulated values, so that linear interpolation will need only a small correction; but all
the arguments appear symmetrically in (1) and the contributions from all terms will
need to be taken into account. Computations are made easier by using a form that
makes the special dependence on neighbouring values explicit and therefore by
abandoning the symmetry.


9·012. Divided differences. The values of $x_r$ and $f(x_r)$ are first arranged in a table,
$x_1, \dots, x_n$. For any two consecutive arguments $x_r$ and $x_{r+1}$ we form the ratio

$$f_1(x_rx_{r+1}) = [x_rx_{r+1}] = \frac{f(x_{r+1}) - f(x_r)}{x_{r+1} - x_r}. \tag{2}$$

* Really due to E. Waring, Phil. Trans. 69 (1779), 59-67; Euler rediscovered it in 1783.
Lagrange's publication was in 1795. It must however have been obvious to Newton. For historical
references see Karl Pearson, Tracts for Computers, 2 (1920).

The notations $f_1$ and $[x_rx_{r+1}]$ are both in frequent use. This ratio is called the first divided
difference. We then form

$$f_2(x_rx_{r+1}x_{r+2}) = [x_rx_{r+1}x_{r+2}] = \frac{[x_{r+1}x_{r+2}] - [x_rx_{r+1}]}{x_{r+2} - x_r}. \tag{3}$$

This is the second divided difference; so we proceed, the divisor at each stage being the
difference of the two values of $x$ each used in only one of the differences subtracted. Now
consider a general $x$ not equal to any of $x_1, x_2, \dots, x_n$. The divided differences involving $x$
exist; and by definition

$$[xx_1] = \frac{f(x_1) - f(x)}{x_1 - x}, \quad\text{whence}\quad f(x) = f(x_1) + [xx_1](x - x_1), \tag{4}$$

$$[xx_1x_2] = \frac{[x_1x_2] - [xx_1]}{x_2 - x}, \quad\text{whence}\quad [xx_1] = [x_1x_2] + [xx_1x_2](x - x_2), \tag{5}$$

$$[xx_1x_2\dots x_n] = \frac{[x_1x_2\dots x_n] - [xx_1\dots x_{n-1}]}{x_n - x}, \tag{6}$$

$$\text{whence}\quad [xx_1x_2\dots x_{n-1}] = [x_1\dots x_n] + [xx_1x_2\dots x_n](x - x_n). \tag{7}$$

Substitute for $[xx_1]$ in the first identity its value given by the second; we get a three-term
relation involving $[xx_1x_2]$. Substitute for $[xx_1x_2]$ its value given by the third identity
and proceed. We have finally

$$f(x) = f(x_1) + (x-x_1)\{[x_1x_2] + (x-x_2)\{[x_1x_2x_3] + \{\dots + (x-x_{n-1})[x_1x_2\dots x_n]\}\dots\}\} + R(x), \tag{8}$$

where

$$R(x) = [xx_1x_2\dots x_n](x-x_1)(x-x_2)\dots(x-x_n). \tag{9}$$

Expanding the series we have

$$\begin{aligned}f(x) ={}& f(x_1) + (x-x_1)[x_1x_2] + (x-x_1)(x-x_2)[x_1x_2x_3] + \dots\\ &+ (x-x_1)\dots(x-x_{n-1})[x_1x_2\dots x_n] + R(x)\end{aligned} \tag{10}$$

$$= P(x) + R(x), \tag{11}$$

say. This is a pure identity arising out of the definition of divided differences. Its utility
depends on the value of $R(x)$, which is not known from the definitions unless $f(x)$ is; but
if we can fix limits to $R(x)$ otherwise, we thereby get limits to the error involved in omitting
it. Then $P(x)$ will be a polynomial of degree $n-1$ representing $f(x)$ with assignable accuracy.
Now consider the divided differences of $x^r$, where $r$ is an integer. We have

$$[x_1x_2] = x_1^{r-1} + x_1^{r-2}x_2 + \dots + x_2^{r-1}, \tag{12}$$

which is a polynomial of degree $r-1$. This can be extended at once to any polynomial
of degree $r$. Therefore the $r$th divided difference of a polynomial of degree $r$ is a constant
and all higher ones are zero.
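The table of divided differences is easily formed mechanically. The routine below is a sketch for illustration (the name `divided_differences` and the sample cubic are ours); it builds the columns $f, f_1, f_2, \dots$ one after another, the divisor for the $k$th column being the difference of arguments $k$ places apart.

```python
def divided_differences(xs, fs):
    """Columns [f, f1, f2, ...]; columns[k][i] is the k-th divided difference
    based on the arguments xs[i], ..., xs[i + k]."""
    columns = [list(fs)]
    for k in range(1, len(xs)):
        prev = columns[-1]
        columns.append([(prev[i + 1] - prev[i]) / (xs[i + k] - xs[i])
                        for i in range(len(prev) - 1)])
    return columns

# A cubic tabulated at unequal intervals: f(x) = 2x^3 - x + 5.
xs = [0.0, 1.0, 3.0, 4.0, 7.0]
fs = [2 * x**3 - x + 5 for x in xs]
table = divided_differences(xs, fs)
```

For this cubic the third column `table[3]` should be constant and equal to the leading coefficient 2, and the fourth should vanish, as the theorem on polynomials of degree $r$ requires.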

Now $f(x)$ is equal to Lagrange's interpolation function $g(x)$ whenever $x = x_1, x_2, \dots, x_n$.
$g(x)$ therefore has the same divided differences based on those $n$ points as $f(x)$.
Consequently if we apply (10) to $g(x)$ we shall obtain

$$g(x) = P(x) + [xx_1\dots x_n](x-x_1)\dots(x-x_{n-1})(x-x_n), \tag{13}$$

the divided difference in the last term being the $n$th divided difference of $g(x)$. But $g(x)$
is a polynomial of degree $n-1$ and therefore its $n$th divided difference is 0. Hence

$$g(x) = P(x) \tag{14}$$

for all values of $x$. Therefore if we define $R(x)$ as $f(x) - P(x)$, $R(x) = 0$ for $x = x_1, x_2, \dots, x_n$.
This is not obvious from the occurrence of a zero factor in (9), because the divided difference
is not defined when two entries coincide.

Now suppose that $f(x)$ has derivatives up to the $n$th in a range $(a, b)$. Then the same
applies to $R(x)$, since $g(x)$ is a polynomial. Suppose $x_1, x_2, \dots, x_n$ arranged in ascending
order. Since $g(x)$ is symmetrical this does not affect $R(x)$. Then by Rolle's theorem, since
$R(x) = 0$ for $x = x_1$ and $x = x_2$, $R'(x) = 0$ for some intermediate $x$; similarly $R'(x) = 0$ for
some $x$ in each of the ranges $x_2$ to $x_3$, ..., $x_{n-1}$ to $x_n$. Again applying Rolle's theorem we
see that $R''(x) = 0$ for $n-2$ values of $x$ between $x_1$ and $x_n$, and proceeding we have
$R^{(n-1)}(x) = 0$ for one intermediate value, say $x = \xi$. But by differentiating (8) we have

$$f^{(n-1)}(x) = (n-1)!\,[x_1x_2\dots x_n] + R^{(n-1)}(x), \tag{15}$$

and therefore

$$f^{(n-1)}(\xi) = (n-1)!\,[x_1x_2\dots x_n]. \tag{16}$$

Hence there is at least one value of $x$ between $x_1$ and $x_n$ such that the $(n-1)$th divided
difference is $1/(n-1)!$ times the $(n-1)$th derivative of $f(x)$. This result is true for all $n$;
hence we can replace $n-1$ by $n$ and infer that

$$[xx_1\dots x_n] = \frac{f^{(n)}(\eta)}{n!}, \tag{17}$$

where $\eta$ is within the range whose end-points are the least and greatest of $x$, $x_1$ and $x_n$.
Then returning to (9) we have

$$R(x) = \frac{1}{n!}\,f^{(n)}(\eta)\,(x-x_1)(x-x_2)\dots(x-x_n). \tag{18}$$

If we can fix limits to the $n$th derivative in any range of $n$ values of the argument, we can
therefore fix a limit to the error introduced by using $P(x)$ instead of $f(x)$. The result of
neglecting $R(x)$ in (10) is Newton's interpolation formula.

It is obvious from successive applications of (12) that the $n$th divided difference of
$x^n$ is 1. It also follows from (17). For if $f(x) = x^n$ the right of (17) is 1 for all $x$ and the
numerators of all higher differences are 0. This fact is convenient in the fitting of power
series to given values of a function. For if the $n$th divided differences are all found to be
$a_n$, $a_n$ is the coefficient of $x^n$. We subtract $a_nx^n$ from all tabulated values, and again form
divided differences. If the arithmetic has been done correctly the differences of order $n-1$
will be constant, and this value will be the coefficient of $x^{n-1}$, and so on. The process
is self-checking, any arithmetical mistake being detected in the next stage of the
calculation.

From the form (18) it is clear that if $f^{(n)}(x)$ does not vary greatly $R(x)$ will be least if
the tabular values used, $x_1$ to $x_n$, are as nearly as possible symmetrically placed about $x$.
The formula is valid for any set of tabular values, but the error inevitable if $f^{(n)}(x)$ is not
zero is much less for interpolation than for extrapolation. Similarly, we shall ordinarily
get greater accuracy in interpolating in the middle of a table than near the ends. The
table of differences will be set out as follows:

    x      f(x)      f_1          f_2              f_3
    x_1    f(x_1)
                     [x_1x_2]
    x_2    f(x_2)                 [x_1x_2x_3]
                     [x_2x_3]                      [x_1x_2x_3x_4]
    x_3    f(x_3)                 [x_2x_3x_4]
                     [x_3x_4]                      [x_2x_3x_4x_5]
    x_4    f(x_4)                 [x_3x_4x_5]
                     [x_4x_5]                      [x_3x_4x_5x_6]
    x_5    f(x_5)                 [x_4x_5x_6]
                     [x_5x_6]
    x_6    f(x_6)



If we wish to interpolate between $x_3$ and $x_4$ the contributions from $f_3$ will be least if we use
the differences $[x_3x_4]$, $[x_2x_3x_4]$, and $[x_2x_3x_4x_5]$, or any other sequence that ends in the
same third difference. It is necessary that the second difference used shall be one of the
two used in forming the third difference, and that the first difference used shall be one of
those used in forming the second difference used. Otherwise it is irrelevant what route
we choose so long as we end with the same third difference; the results will always be values
of the cubic polynomial that agrees with $f(x)$ at $x = x_2, x_3, x_4, x_5$. But the arithmetic is
easier if we keep as nearly as possible to a horizontal line of the table.

The form (10) is usually less convenient than (8); the higher divided differences
are usually small and it is troublesome to keep track of the decimal point when they
are multiplied by two or more factors. Using (8) we begin at the end and calculate
$(x - x_{n-1})[x_1x_2\dots x_n]$. We add this to $[x_1x_2\dots x_{n-1}]$; multiply the result by $(x - x_{n-2})$, add
to $[x_1x_2\dots x_{n-2}]$ and so work back to the beginning. A better way still is to compute the
first two terms directly; these represent the result of linear interpolation, and can be
built up directly on a multiplying machine. First $f(x_1)$ is set up and transferred to the
product register by one turn of the handle. Then $[x_1x_2]$ is set up and multiplied into
all values of $x - x_1$ required up to and including $x_2 - x_1$. The last should give $f(x_2)$
and check the calculation of $[x_1x_2]$. The multiplier register should be cleared before
multiplication begins so that the successive values of $x - x_1$ can be read directly on it.
We then write (8) in the form

$$f(x) = \{f(x_1) + (x-x_1)[x_1x_2]\} + (x-x_1)(x-x_2)\{[x_1x_2x_3] + (x-x_3)\{[x_1x_2x_3x_4] + \dots\}\}.$$

The last batch of terms will be a small correction in most cases, and those in the first
brackets have already been built up on the machine. It is desirable to work to one more
figure than is given in the data in order to prevent accumulation of rounding-off errors.
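The rule of beginning at the end and working back is the same nesting that a program would use. The sketch below is illustrative only (function names and the sample cubic are ours, and the remainder $R(x)$ is neglected): it forms the leading divided differences $[x_1], [x_1x_2], [x_1x_2x_3], \dots$ in place and then evaluates the nested form (8).

```python
def newton_coefficients(xs, fs):
    """Leading divided differences [x1], [x1 x2], ..., [x1 ... xn]."""
    coeffs = list(fs)
    n = len(xs)
    for k in range(1, n):
        # Work upwards from the bottom so that lower-order entries are still available.
        for i in range(n - 1, k - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - k])
    return coeffs

def newton_nested(xs, coeffs, x):
    """Evaluate the interpolation polynomial by the nested form, last term first."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = coeffs[k] + (x - xs[k]) * result
    return result

xs = [0.0, 1.0, 3.0, 4.0]
fs = [2 * x**3 - x + 5 for x in xs]      # sample cubic, so P(x) is exact
coeffs = newton_coefficients(xs, fs)
value = newton_nested(xs, coeffs, 2.0)
```

For these data the coefficients should be 5, 1, 8, 2 and `value` should be 19, the true value of the cubic at $x = 2$. Each step of the nesting multiplies by a single factor, which is the advantage claimed for (8) over (10).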

The standard numerical methods all depend on replacing $f(x)$ by the interpolation
polynomial $P(x)$, which is of degree one less than the number of data. In general $f(x) \ne P(x)$
except at the datum values, but the difference lies within assignable limits. $P(x)$ is the
smoothest function that agrees with $f(x)$ at the required points, since $d^nP(x)/dx^n = 0$
for all $x$; this would not be true of any other function.


9·013. As a specimen of the method let us take some irregularly spaced values of sin x°
and interpolate to multiples of 5°. The data and the divided differences are as follows:

     x    sin x°    f_1                    f_2                         f_3
     0    0.0000
                    0.2250/13 = 0.01731
    13    0.2250                           -0.00079/24 = -0.000033
                    0.1817/11 = 0.01652                                -0.000030/37 = -0.0000008
    24    0.4067                           -0.00151/24 = -0.000063
                    0.1951/13 = 0.01501                                -0.000031/41 = -0.0000008
    37    0.6018                           -0.00282/30 = -0.000094
                    0.2072/17 = 0.01219                                -0.000026/43 = -0.0000006
    54    0.8090                           -0.00361/30 = -0.000120
                    0.1115/13 = 0.00858                                -0.000020/42 = -0.0000005
    67    0.9205                           -0.00349/25 = -0.000140
                    0.0611/12 = 0.00509                                -0.000009/36 = -0.0000002
    79    0.9816                           -0.00342/23 = -0.000149
                    0.0184/11 = 0.00167
    90    1.0000

The interpolation runs as follows. The second column gives the part built up by linear
interpolation between the two adjacent datum values, the third the correction for $f_2$
and $f_3$.

     x    Linear     Correction                                                          sin x°     Correct
          formula                                                                                   value
     5    0.08655    -5 × 8(-0.000033 + 19 × 0.0000008) = 40 × 0.000018 = 0.00072        0.08727    0.0872
    10    0.17310    -10 × 3(-0.000033 + 14 × 0.0000008) = 30 × 0.000022 = 0.00066       0.17376    0.1736
    15    0.25804    -2 × 9(-0.000033 - 15 × 0.0000008) = 18 × 0.000045 = 0.00081        0.25885    0.2588
    20    0.34064    -7 × 4(-0.000033 - 20 × 0.0000008) = 28 × 0.000049 = 0.00137        0.34201    0.3420
    25    0.42171    -1 × 12(-0.000063 - 12 × 0.0000008) = 12 × 0.000073 = 0.00088       0.42259    0.4226
    30    0.49676    -6 × 7(-0.000063 - 17 × 0.0000008) = 42 × 0.000077 = 0.00323        0.49999    0.5000
    35    0.57181    -11 × 2(-0.000063 - 22 × 0.0000008) = 22 × 0.000081 = 0.00178       0.57359    0.5736
    40    0.63837    -3 × 14(-0.000094 - 16 × 0.0000006) = 42 × 0.000104 = 0.00437       0.64274    0.6428
    45    0.69932    -8 × 9(-0.000094 - 21 × 0.0000006) = 72 × 0.000107 = 0.00770        0.70702    0.7071
    50    0.76027    -13 × 4(-0.000094 - 26 × 0.0000006) = 52 × 0.000110 = 0.00572       0.76599    0.7660
    55    0.81758    -1 × 12(-0.000120 - 18 × 0.0000005) = 12 × 0.000129 = 0.00155       0.81913    0.8192
    60    0.86048    -6 × 7(-0.000120 - 23 × 0.0000005) = 42 × 0.000132 = 0.00554        0.86602    0.8660
    65    0.90338    -11 × 2(-0.000120 - 28 × 0.0000005) = 22 × 0.000134 = 0.00295       0.90633    0.9063
    70    0.93577    -3 × 9(-0.000140 - 16 × 0.0000002) = 27 × 0.000143 = 0.00386        0.93963    0.9397
    75    0.96122    -8 × 4(-0.000140 - 21 × 0.0000002) = 32 × 0.000144 = 0.00461        0.96583    0.9659
    80    0.98327    -1 × 10(-0.000149 - 13 × 0.0000002) = 10 × 0.000152 = 0.00152       0.98479    0.9848
    85    0.99162    -6 × 5(-0.000149 - 18 × 0.0000002) = 30 × 0.000153 = 0.00459        0.99621    0.9962


The results of the interpolation are given in the last column but one, and the values taken
directly from the tables in the last. It will be seen that the difference only once exceeds
a unit in the fourth place, and is fully accounted for by the fact that errors from neglect
of the fifth decimal would run up to half a unit in the fourth place both in the datum
values and in those used for comparison at the end. At 45° the contribution from $f_3$
amounts to 10 units in the fourth place, and an error of half a unit in the last figure of $f_3$
would contribute 0.7 in the fourth place of the interpolate. It is only for rather wide
intervals such as these that the third difference matters in interpolation to four figures.
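The whole specimen can be repeated by machine as a check. The sketch below is ours and makes two assumptions: each interpolate is taken from the cubic through the four datum values that bracket $x$ as evenly as the table allows (the same sets of data as are used in the text, though not the same order of working), and the comparison is with `math.sin` instead of a four-figure table.

```python
import math

XS = [0, 13, 24, 37, 54, 67, 79, 90]
FS = [0.0000, 0.2250, 0.4067, 0.6018, 0.8090, 0.9205, 0.9816, 1.0000]

def interpolate_cubic(x):
    """Cubic through four neighbouring datum values, evaluated at x (degrees)."""
    i = max(k for k in range(len(XS) - 1) if XS[k] <= x)   # interval containing x
    lo = min(max(i - 1, 0), len(XS) - 4)                   # first of the four data used
    xs, fs = XS[lo:lo + 4], FS[lo:lo + 4]
    coeffs = list(fs)
    for k in range(1, 4):                                  # leading divided differences
        for j in range(3, k - 1, -1):
            coeffs[j] = (coeffs[j] - coeffs[j - 1]) / (xs[j] - xs[j - k])
    result = coeffs[3]
    for k in (2, 1, 0):                                    # nested evaluation
        result = coeffs[k] + (x - xs[k]) * result
    return result

errors = {x: interpolate_cubic(x) - math.sin(math.radians(x)) for x in range(5, 90, 5)}
worst = max(abs(e) for e in errors.values())
```

Here `worst` should be of the order of a unit in the fourth decimal place, which is all that four-figure data can be expected to give.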

Since the first difference used in each case is based on the two adjacent values, the
coefficient of $f_2$ is always negative. From 15° onwards the second difference used is that
given on the same horizontal line as the beginning of the interval, and the extra datum
value used in forming it is the one before the beginning of the interval. Consequently
$x - x_3$, the new factor multiplying $f_3$, is positive. This is not possible in the first interval;
the second difference used is opposite the end of the interval and involves the datum
at 24°. Hence the factor multiplying $f_3$ is negative in this range.

An increase of accuracy is sometimes possible if the derivative of the function is known
for some value in the range. Here, for instance, we know that at 90° the derivative of
sin x° is 0. This can be treated by extending the table one line as follows:

     x    sin x°    f_1        f_2          f_3
    67    0.9205
                    0.00509
    79    0.9816               -0.000149
                    0.00167                 -0.000003/23 = -0.0000001
    90    1.0000               -0.000152
                    0.00000
    90    1.0000

Between 79° and 90° we can now use the formula

$$\sin x^\circ = 1.0000 - 0.000152(x-90)^2 - 0.0000001(x-90)^2(x-79).$$
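The device of a repeated argument is easily mechanized. In the sketch below, which is ours, a first divided difference between coincident arguments is replaced by the known derivative; the function name and the dictionary `derivs` are illustrative assumptions, and only a doubled argument is provided for.

```python
import math

def differences_with_derivative(xs, fs, derivs):
    """Divided-difference columns when an argument may appear twice.

    derivs maps a doubled argument to the known derivative there; it is used
    in place of the indeterminate first difference 0/0.
    """
    columns = [list(fs)]
    for k in range(1, len(xs)):
        prev, col = columns[-1], []
        for i in range(len(prev) - 1):
            dx = xs[i + k] - xs[i]
            col.append(derivs[xs[i]] if dx == 0 else (prev[i + 1] - prev[i]) / dx)
        columns.append(col)
    return columns

xs = [67, 79, 90, 90]
fs = [0.9205, 0.9816, 1.0000, 1.0000]
cols = differences_with_derivative(xs, fs, {90: 0.0})
f2, f3 = cols[2][1], cols[3][0]          # [79, 90, 90] and [67, 79, 90, 90]

def sin_near_90(x):
    """Interpolation formula based on 90, 90, 79, 67, for x between 79 and 90."""
    return fs[2] + f2 * (x - 90) ** 2 + f3 * (x - 90) ** 2 * (x - 79)
```

The values of `f2` and `f3` should agree with the entries $-0.000152$ and $-0.0000001$ of the extended table to the figures given there, and `sin_near_90(85)` should agree with sin 85° to about four decimal places.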

9·02. Interpolation with equal intervals. The formation of divided differences is
rather laborious, but cannot be avoided when the intervals of the argument and the
function are both irregular. If the intervals of the argument are all equal it can be replaced
by simple subtraction. Two classes of formulae are available: the Gregory formula and
what is called the Gregory-backward* formula, on the one hand, and the various central
difference formulae on the other. The Gregory formula corresponds to the method that we
used near the beginning of the range in the above illustration, whereas the central
difference formulae use as far as possible differences near the same horizontal line of the table.
For the same reason as with divided differences the latter will be the better when they
can be used, since the higher differences will be multiplied by smaller factors.

With equal intervals we form the differences as follows:

$$\Delta y_r = y_{r+1} - y_r, \qquad \Delta^2y_r = \Delta y_{r+1} - \Delta y_r = y_{r+2} - 2y_{r+1} + y_r, \quad\text{etc.},$$

each difference being formed by subtracting the two nearest entries to its left. This
notation is the most convenient when the Gregory formula is being used. With central
differences and backward differences other notations are more convenient, but the actual
entry in each position in the table is the same:



              Forward differences               Central differences                     Backward differences

    y_{-2}
              Δy_{-2}                           δy_{-3/2}                               ∇y_{-1}
    y_{-1}              Δ²y_{-2}                            δ²y_{-1}                              ∇²y_0
              Δy_{-1}             Δ³y_{-2}      δy_{-1/2}             δ³y_{-1/2}        ∇y_0                ∇³y_1
    y_0                 Δ²y_{-1}                            δ²y_0                                 ∇²y_1
              Δy_0                Δ³y_{-1}      δy_{1/2}              δ³y_{1/2}         ∇y_1                ∇³y_2
    y_1                 Δ²y_0                               δ²y_1                                 ∇²y_2
              Δy_1                              δy_{3/2}                                ∇y_2
    y_2


It is evident from the mode of formation that the first divided differences in
corresponding positions would be $\Delta y/h$, the second $\Delta^2y/2h^2$, and the $n$th $\Delta^ny/n!\,h^n$. Then if we
use differences based on $x_0$, $x_0+h$, $x_0+2h$, ... we have at once from Newton's formula

$$f(x_0+\theta h) = y_0 + \theta\,\Delta y_0 + \frac{\theta(\theta-1)}{2!}\,\Delta^2y_0 + \dots + \frac{\theta(\theta-1)\dots(\theta-n+1)}{n!}\,\Delta^ny_0 + R_{n+1}, \tag{1}$$

where $R_{n+1}$ is found from the remainder in 9·012 (18) to be

$$\frac{\theta(\theta-1)\dots(\theta-n)}{(n+1)!}\,h^{n+1}\left(\frac{d^{n+1}y}{dx^{n+1}}\right)_{x=\eta}.$$

This is Gregory's formula, discovered by James Gregory in 1670; Newton's more general
formula was published in 1687. We see that it has the form of a binomial series, $y_0$ being
preceded by the operator $(1+\Delta)^\theta$. It has in fact an operational interpretation. If we define
$D$ as meaning $d/dx$, Taylor's theorem may be written

$$f(x+a) = \left(1 + aD + \frac{a^2D^2}{2!} + \dots\right)f(x) = e^{aD}f(x). \tag{2}$$

If we also write

$$f(x+h) = Ef(x) = (1+\Delta)f(x) = e^{hD}f(x), \tag{3}$$

we have

$$f(x+\theta h) = e^{\theta hD}f(x) = (1+\Delta)^\theta f(x). \tag{4}$$

These operators occurring in interpolation theory are fundamentally different from
those of Heaviside's methods; here the fundamental operator is $D$, whereas in Heaviside's
methods it is $p^{-1}$, which is not simply the inverse of $D$ because the two do not commute.
The justification of their use is therefore quite different. Expansion in powers of $p^{-1}$ is
justifiable in a much wider class of cases than that in powers of $D$. The infinite Taylor
series becomes meaningless if the function operated on has no derivative above a certain
finite order at some point of the range, but the $p^{-1}$ series requires nothing more of the
function than that it should be integrable. The justification here rests on the fact that in
interpolation the function is replaced by a polynomial $P(x)$, with an accuracy fixed by
the lowest difference neglected. The terms of the Taylor series of $P(x)$ containing $D^n$ and
higher powers are all zero if only differences to order $n-1$ are used. The operational process
in powers of $D$ is therefore valid in problems of interpolation because it is carried out only
on the interpolation polynomial, not on the original function, of which the polynomial is
only an approximate representation of known accuracy.
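With equal intervals the work reduces to subtraction and one running binomial coefficient. The following sketch is ours (the function names and the sample table of cubes are illustrative) and neglects the remainder.

```python
def forward_differences(ys):
    """Columns [y, Δy, Δ²y, ...] formed by repeated subtraction."""
    columns = [list(ys)]
    while len(columns[-1]) > 1:
        prev = columns[-1]
        columns.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return columns

def gregory(ys, theta):
    """Estimate f(x0 + theta*h) from y0 and its leading forward differences."""
    total, coeff = 0.0, 1.0
    for k, column in enumerate(forward_differences(ys)):
        total += coeff * column[0]           # coeff = theta(theta-1)...(theta-k+1)/k!
        coeff *= (theta - k) / (k + 1)
    return total

cubes = [0.0, 1.0, 8.0, 27.0]                # f(x) = x^3 at x = 0, 1, 2, 3 (h = 1)
value = gregory(cubes, 1.5)
```

For a cubic tabulated at four points the third differences are constant, the series terminates, and `value` should be $1.5^3 = 3.375$.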

The binomial theorem can be derived from Gregory's formula. Take intervals 1 of
the exponent in $(1+x)^n$ for given $x$; the difference table reads:

    n    f(n)       Δf(n)        Δ²f(n)      Δ³f(n)
    0    1
                    x
    1    1+x                     x²
                    x(1+x)                   x³
    2    (1+x)²                  x²(1+x)
                    x(1+x)²
    3    (1+x)³

and the Gregory formula based on $f(0)$ and its differences reads

$$f(n) = 1 + nx + \frac{n(n-1)}{2!}\,x^2 + \dots + \frac{n(n-1)\dots(n-r+1)}{r!}\,x^r + \dots, \tag{5}$$

which is the binomial theorem for a real fractional index.*
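The series (5) can be summed directly as a check. This is a sketch of our own; the function name and the number of terms are arbitrary, and convergence for a fractional index is assumed only for $|x| < 1$.

```python
def binomial_series(n, x, terms=40):
    """Partial sum of 1 + n x + n(n-1)/2! x^2 + ... with the given number of terms."""
    total, term = 0.0, 1.0
    for r in range(terms):
        total += term
        term *= (n - r) / (r + 1) * x     # next coefficient n(n-1)...(n-r)/(r+1)!
    return total

approx_root = binomial_series(0.5, 0.2)   # should approach (1.2)^(1/2)
cube = binomial_series(3, 0.2)            # the series terminates for integral n
```

For $n = \tfrac12$, $x = 0.2$ the partial sum should agree with $\sqrt{1.2}$ to many figures, while for $n = 3$ every term beyond $x^3$ vanishes and the sum is $(1.2)^3$.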

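9·03. In the Gregory backward formula the differences ascending diagonally from $x_0$
are used. We have

$$f(x_0+\theta h) = f_0 + \theta\,\Delta f_{-1} + \frac{\theta(\theta+1)}{2!}\,\Delta^2f_{-2} + \dots. \tag{6}$$

This also can be easily derived operationally. If we write

$$f(x_r) - f(x_{r-1}) = \nabla f(x_r), \tag{7}$$

we have

$$\nabla = \Delta E^{-1} = 1 - E^{-1}, \tag{8}$$

whence

$$E = (1-\nabla)^{-1}, \tag{9}$$

$$f(x+\theta h) = E^\theta f(x) = (1-\nabla)^{-\theta}f(x) \tag{10}$$

$$= f(x) + \theta\,\nabla f(x) + \frac{\theta(\theta+1)}{2!}\,\nabla^2f(x) + \dots. \tag{11}$$
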


* It appears to be established that James Gregory knew and used Taylor’s theorem as early as 
1670, and therefore had another approach to the binomial theorem at that time. He apparently 
did not publish it on account of a mistaken belief that Newton must have found it too. Brook 
Taylor’s publication of the theorem was in 1712, that of the so-called Maclaurin’s theorem in 1742. 
It is incredible that between the latter two dates nobody thought of putting a = 0 in Taylor’s theorem. 
Maclaurin has three better titles to fame, namely his independent discovery of the Euler-Maclaurin 
expansion and of the ‘Maclaurin ellipsoids’ in hydrodynamics, and the introduction into mechanics 
of the systematic use of rectangular coordinates. Cf. H. W. Turnbull, James Gregory Tercentenary 
Volume (1939) and Mathematical Discoveries of Newton (1945); Bell, Development of Mathematics. 
This formula is really an extrapolation formula, since it can be used to infer values of the
function beyond the range tabulated; of course with the usual increase of inaccuracy due
to the extra range. The same applies to the Gregory formula for $\theta < 0$.


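9.04. Using the central difference notation and a zigzag set of differences following a
horizontal line as closely as possible we have from Newton’s formula

$$\begin{aligned}
f(x_0+\theta h) &= f_0 + \theta h\,\frac{\delta f_{1/2}}{h} + \theta h(\theta h-h)\frac{\delta^2 f_0}{2!\,h^2} + \theta h(\theta h-h)(\theta h+h)\frac{\delta^3 f_{1/2}}{3!\,h^3} \\
&\quad + \theta h(\theta h-h)(\theta h+h)(\theta h-2h)\frac{\delta^4 f_0}{4!\,h^4} + \dots \\
&= f_0 + \theta\,\delta f_{1/2} + \frac{\theta(\theta-1)}{2!}\delta^2 f_0 + \frac{\theta(\theta-1)(\theta+1)}{3!}\delta^3 f_{1/2} + \frac{\theta(\theta-1)(\theta+1)(\theta-2)}{4!}\delta^4 f_0 + \dots \\
&\quad + \frac{\theta(\theta-1)(\theta+1)\dots(\theta-n+1)(\theta+n-1)}{(2n-1)!}\delta^{2n-1}f_{1/2} \\
&\quad + \frac{\theta(\theta-1)(\theta+1)\dots(\theta-n+1)(\theta+n-1)(\theta-n)}{(2n)!}\delta^{2n}f_0 + \dots.
\end{aligned} \tag{12}$$

This is the Newton-Gauss formula. An equivalent formula can of course be obtained by
using the differences $\delta f_{-1/2}$, $\delta^2 f_0$, $\delta^3 f_{-1/2}$, but would be less convenient for interpolating
between $\theta = 0$ and $\theta = 1$. The zigzag arrangement of the differences used makes the
formula somewhat awkward to use, but this can be circumvented in three ways. We
introduce a further symbol $\mu$ to indicate the mean of two adjacent elements in the same
vertical column; thus

$$\mu\delta f_0 = \tfrac12(\delta f_{-1/2} + \delta f_{1/2}), \qquad \mu\delta^3 f_0 = \tfrac12(\delta^3 f_{-1/2} + \delta^3 f_{1/2}),$$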

and so on. Then we can rewrite the Newton-Gauss formula as follows: 

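$$\begin{aligned}
f(x_0+\theta h) &= f_0 + \theta(\delta f_{1/2} - \tfrac12\delta^2 f_0) + \frac{\theta^2}{2!}\delta^2 f_0 + \dots \\
&\quad + \frac{\theta(\theta^2-1^2)(\theta^2-2^2)\dots\{\theta^2-(n-1)^2\}}{(2n-1)!}(\delta^{2n-1}f_{1/2} - \tfrac12\delta^{2n}f_0) \\
&\quad + \frac{\theta^2(\theta^2-1^2)(\theta^2-2^2)\dots\{\theta^2-(n-1)^2\}}{(2n)!}\delta^{2n}f_0 + \dots
\end{aligned} \tag{13}$$

$$\begin{aligned}
&= f_0 + \theta\,\mu\delta f_0 + \frac{\theta^2}{2!}\delta^2 f_0 + \frac{\theta(\theta^2-1^2)}{3!}\mu\delta^3 f_0 + \dots \\
&\quad + \frac{\theta(\theta^2-1^2)(\theta^2-2^2)\dots\{\theta^2-(n-1)^2\}}{(2n-1)!}\mu\delta^{2n-1}f_0 \\
&\quad + \frac{\theta^2(\theta^2-1^2)(\theta^2-2^2)\dots\{\theta^2-(n-1)^2\}}{(2n)!}\delta^{2n}f_0 + \dots.
\end{aligned} \tag{14}$$

This is the Newton-Stirling formula. We see that it involves rewriting the difference table
so that all the entries lie on horizontal lines through the datum values. The even differences
remain where they were, but the odd ones are replaced by means in accordance with
the definition of $\mu$.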

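Alternatively, if we are interpolating between $\theta = 0$ and 1, we may keep the odd differences
where they are but use mean even differences centred on $\theta = \frac12$. We have

$$\delta^{2n+1}f_{1/2} = \delta^{2n}f_1 - \delta^{2n}f_0, \tag{15}$$

whence

$$\delta^{2n}f_0 = \mu\delta^{2n}f_{1/2} - \tfrac12\delta^{2n+1}f_{1/2}, \tag{16}$$

and the terms in $\delta^{2n}$, $\delta^{2n+1}$ of the Newton-Gauss formula can be written

$$\frac{(\theta+n-1)(\theta+n-2)\dots(\theta-n)}{(2n)!}\left\{\delta^{2n}f_0 + \frac{\theta+n}{2n+1}\delta^{2n+1}f_{1/2}\right\}.$$

The last factor is $\mu\delta^{2n}f_{1/2} + \dfrac{\theta-\frac12}{2n+1}\delta^{2n+1}f_{1/2}$.

Hence

$$f(x_0+\theta h) = f_0 + \theta\,\delta f_{1/2} + \frac{\theta(\theta-1)}{2!}\mu\delta^2 f_{1/2} + \frac{\theta(\theta-\frac12)(\theta-1)}{3!}\delta^3 f_{1/2} + \frac{(\theta+1)\theta(\theta-1)(\theta-2)}{4!}\mu\delta^4 f_{1/2} + \dots. \tag{17}$$

This is the Newton-Bessel formula.*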

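A further modification of the Newton-Gauss formula is obtained by using (15) to
eliminate the odd differences. We find

$$\frac{(\theta+n-1)\dots(\theta-n)}{(2n)!}\left\{\delta^{2n}f_0 + \frac{\theta+n}{2n+1}(\delta^{2n}f_1 - \delta^{2n}f_0)\right\} = \frac{(\theta+n-1)\dots(\theta-n)}{(2n+1)!}\left\{(\theta+n)\,\delta^{2n}f_1 + (\phi+n)\,\delta^{2n}f_0\right\},$$

where $\phi = 1 - \theta$;

also $\theta(\theta-1)\dots(\theta+n-1)(\theta-n) = \phi(\phi-1)\dots(\phi+n-1)(\phi-n)$.

Hence

$$\begin{aligned}
f(x) &= \left\{\phi f_0 + \frac{\phi(\phi^2-1^2)}{3!}\delta^2 f_0 + \frac{\phi(\phi^2-1^2)(\phi^2-2^2)}{5!}\delta^4 f_0 + \dots\right\} \\
&\quad + \left\{\theta f_1 + \frac{\theta(\theta^2-1^2)}{3!}\delta^2 f_1 + \frac{\theta(\theta^2-1^2)(\theta^2-2^2)}{5!}\delta^4 f_1 + \dots\right\}.
\end{aligned} \tag{18}$$

This is Everett’s formula.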

These three formulae all have special advantages. The Newton-Stirling formula,
proceeding in terms of differences centred on $x_0$, has the terms in the even differences even
functions of $\theta$, those in the odd differences odd functions of $\theta$. Hence to get values for
equal and opposite values of $\theta$ we can build up the terms in the odd and even differences
separately, and then the values of $f(x_0 \pm \theta h)$ are found by simple addition and subtraction,
that is, by three turns on a multiplying machine. It is also convenient for deriving
expressions for the derivatives of the function at the datum values. The advantage of the
Newton-Bessel formula is that the odd differences after the first are all multiplied by
functions that vanish at $\theta = \frac12$. In comparison with the Newton-Stirling formula we
notice that the maximum of $|\theta(\theta^2-1)|$ for $0 < \theta < \frac12$ is $\frac38$, but that of $|\theta(\theta-1)(\theta-\frac12)|$ is
0.048. If then we neglect the third difference the Newton-Bessel formula is seven times
as accurate. In other words, if we want the error to be less than half a unit we can neglect
third differences under 60 units. In practice we often need to retain second differences,


* These three formulae are actually all due to Newton. The second name in each case is only
a label. See Karl Pearson’s bibliography mentioned on p. 261. Pearson condemns the Newton-Gauss
formula, and it is true that this formula is never used for computation. But its direct
relation to Newton’s formula makes it the easiest to prove, and the others follow from it without
trouble.

and third differences are sometimes needed. But the need for them is very greatly reduced
if we partly absorb them into the second differences by using a mean second difference
and the Newton-Bessel formula. If we do this it is convenient to arrange the difference
table slightly differently. We notice that

$$\mu\delta^2 f_{1/2} = \tfrac12(\delta^2 f_0 + \delta^2 f_1) = \tfrac12(\delta f_{1/2} - \delta f_{-1/2} + \delta f_{3/2} - \delta f_{1/2}) = \tfrac12(\delta f_{3/2} - \delta f_{-1/2}), \tag{19}$$

and therefore is half the difference of the two first differences just after and just before
the interval that we are to interpolate in. Consequently we need not write out the column
of second differences explicitly. We rewrite Bessel’s formula, accurate to second differences,
in the form

$$f(x_0+\theta h) = f_0 + \theta\,\delta f_{1/2} - \tfrac14\theta(1-\theta)(\delta f_{3/2} - \delta f_{-1/2}), \tag{20}$$

and the last factor, obtained by subtracting alternate first differences, is written instead
of the second difference. The function of $\theta$ is as follows, for multiples of 0.1:

  θ            -¼θ(1-θ)
  0.0, 1.0      0.0000
  0.1, 0.9     -0.0225
  0.2, 0.8     -0.0400
  0.3, 0.7     -0.0525
  0.4, 0.6     -0.0600
  0.5          -0.0625


A more extended table is given by Milne-Thomson and Comrie.* 

Everett’s formula, if second differences are kept, also takes complete account of the
third differences, and in this respect is even better than the Newton-Bessel formula. The
coefficients of the second and fourth differences are as follows:

  θ      -(1/6)θ(1-θ²)    (1/120)θ(1-θ²)(4-θ²)
  0.0     0                0
  0.1    -0.0165           0.00329
  0.2    -0.0320           0.00634
  0.3    -0.0455           0.00890
  0.4    -0.0560           0.01075
  0.5    -0.0625           0.01172
  0.6    -0.0640           0.01165
  0.7    -0.0595           0.01044
  0.8    -0.0480           0.00806
  0.9    -0.0285           0.00455
  1.0     0                0

Allowing for the fact that adjacent values of $\delta^4 f$ will in general be nearly equal we see that
$\delta^4 f$ can reach 20 units in a given decimal place without introducing a correction of half a
unit in that place into the interpolate. Second differences exceeding 4 units require
attention. If Bessel’s formula is used third differences over 60 units should be retained:
but it then becomes just as easy to use Everett’s formula.

Another method, known as the throw-back, is usefully combined with Everett’s formula.
The coefficients of each difference in this formula keep the same sign across an interval.
In particular, that of the fourth difference is $-(4-\theta^2)/20$ times that of the second, and this
ratio varies only in a ratio of 3 to 4. Consequently the fourth differences can be largely
taken into account by a suitable modification of the second differences. It is shown by
Comrie† that if, instead of using $\delta^2$ as it stands, we use $\delta^2 - 0.184\,\delta^4$, the resulting error in

* Standard Four-Figure Mathematical Tables.
† British Association Mathematical Tables, vol. 1, Introduction.
the interpolate will not exceed half a unit if $\delta^4$ itself does not exceed 1000 units. A similar
device is extensively used in the British Association tables for modifying $\delta^4$ to take partial
account of $\delta^6$ and so on.®

Many mathematical tables are now published with differences ready printed. 
First differences are given when the function will stand linear interpolation, but 
the usual arrangement of a difference table with the odd differences on intermediate 
lines gives rise to some trouble in printing if second and higher differences are given. 
Everett’s formula has the great advantage at this point that it uses only even 
differences, so that only these need be printed, and they lie on the same line as 
the datum value. 


9.041. Discussion of efficiency. The most convenient formulae to use with equal
intervals in various circumstances are as follows, if the lowest difference neglected is to 
contribute less than half a unit in the last place. 


(1) Near the beginning or end of the table, where a centred second difference is not 
available at one end of the interval, there is no alternative to the Gregory formula. (2) If 
the third differences do not exceed 60 units in the last place and the second differences 
exceed 4 units, Bessel’s formula with mean second differences is far the most convenient. 
(3) If the third differences exceed 60 units but the fourth differences do not exceed 1000 
units Everett’s formula with the throw-back is adequate. (4) Larger fourth differences
need explicit allowance for $\delta^4$ and possibly higher differences; a full account is given in
the introduction to the British Association Mathematical Tables, vol. 1. 

The conditions contemplated in the third case, and still more so in the fourth, arise 
for functions tabulated to a large number of figures. A prohibitive number of entries 
would then be needed to permit even Bessel interpolation, with second differences, 
and there is no alternative to using intervals so long that higher differences become 
necessary. In such cases it is quite possible for interpolation with fourth differences to 
give an answer correct to ten figures when linear interpolation will not give one correct 
to three. 

Mention should be made at this point of the use of Taylor’s theorem. As it depends only 
on the function and its derivatives at one datum value it can hardly be called inter¬ 
polation; but when the derivatives are known it will achieve a higher accuracy for a given 
interval with the same number of terms. In the divided difference formula knowledge of
derivatives up to the $n$th at one datum value is equivalent to having $n+1$ data with their
divided differences at that value, and the corresponding terms in the interpolation formula
are those of Taylor’s series. There is no way of taking such information into account with
equal intervals.

In general $\mu\delta^{2n+1}f_0$ is about $f^{(2n+1)}(x_0)h^{2n+1}$, and for $\theta$ small its coefficient in the
Newton-Stirling series is about $(-)^n\theta\,(n!)^2/(2n+1)!$. The whole term is therefore about

$$(-)^n\frac{(n!)^2}{(2n+1)!}\,h^{2n+1}\theta f^{(2n+1)}(x_0).$$

The corresponding term in Taylor’s series is

$$\frac{h^{2n+1}\theta^{2n+1}}{(2n+1)!}f^{(2n+1)}(x_0)$$

and is much smaller for $|\theta| < 1$ even for quite small values of $n$. Consequently if the
derivatives are known there is no point in using any of the interpolation formulae;
these are required when our only information about the function is derived from the
tabular values themselves.
9.042. The following example illustrates the use of Bessel’s formula. Given $x^{1/2}$ for
$x = 2, 3, 4, 5, 6$, infer values between $x = 3$ and $x = 4$. The difference table is as follows:

  x      x^(1/2)    Δf        Δ²f       2μδ²
  2.0    1.414
                    +0.318
  3.0    1.732                -0.050
                    +0.268              -0.082
  4.0    2.000                -0.032
                    +0.236              -0.055
  5.0    2.236                -0.023
                    +0.213
  6.0    2.449

Inspection of the second differences shows that the third differences are under 20 in the
last place; hence we can use Bessel’s formula with second differences. We first interpolate
linearly at intervals of 0.1. We then multiply the double mean second difference -0.082
(= -0.050 - 0.032 = 0.236 - 0.318) by the coefficients $-\frac14\theta(1-\theta)$ from the table above,
and add to the linear interpolate (note that in all formulae the coefficient of the second
difference is negative):

  x      Linear + correction = interpolate    Correct value
  3.1    1.7588 + 0.0018 = 1.7606             1.761
  3.2    1.7856 + 0.0033 = 1.7889             1.789
  3.3    1.8124 + 0.0043 = 1.8167             1.817
  3.4    1.8392 + 0.0049 = 1.8441             1.844
  3.5    1.8660 + 0.0051 = 1.8711             1.871
  3.6    1.8928 + 0.0049 = 1.8977             1.897
  3.7    1.9196 + 0.0043 = 1.9239             1.924
  3.8    1.9464 + 0.0033 = 1.9497             1.949
  3.9    1.9732 + 0.0018 = 1.9750             1.975

The error never exceeds 1 in the third figure. It is surprising at first sight that such good
agreement should be possible when a constant second difference is used for interpolation,
seeing that the second differences at the beginning and end of the range are nearly as
3 to 2. But the mean second difference must give agreement at the beginning, middle, and
end of each interval, and the errors never have a chance to accumulate.
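
The arithmetic of this example can be reproduced with a short sketch of formula (20). The function name `bessel_second_order` and the dictionary layout are illustrative assumptions; the data are the three-figure square roots tabulated above.

```python
from math import sqrt

def bessel_second_order(f0, d_before, d_mid, d_after, theta):
    """Formula (20): f0 + θ δf(1/2) - ¼ θ(1-θ) {δf(3/2) - δf(-1/2)}."""
    return f0 + theta * d_mid - 0.25 * theta * (1 - theta) * (d_after - d_before)

roots = [1.414, 1.732, 2.000, 2.236, 2.449]         # x^(1/2) at x = 2, 3, 4, 5, 6
first = [b - a for a, b in zip(roots, roots[1:])]   # first differences

# Interpolate between x = 3 and x = 4 at intervals of 0.1.
interpolated = {
    round(3 + k / 10, 1): bessel_second_order(
        roots[1], first[0], first[1], first[2], k / 10)
    for k in range(1, 10)
}
```

For instance `interpolated[3.5]` comes out close to the tabulated 1.8711; no column of second differences is formed, only the difference of alternate first differences.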


9.043. The following harder example illustrates the use of Everett’s formula:

  x     cot x°     Δf         Δ²f        Δ³f        Δ⁴f        δ²f (modified)
  30    1.7321
                   -0.3040
  35    1.4281                +0.0677
                   -0.2363               -0.0232
  40    1.1918                +0.0445               +0.0096    +0.0427
                   -0.1918               -0.0136
  45    1.0000                +0.0309               +0.0047    +0.0300
                   -0.1609               -0.0089
  50    0.8391                +0.0220               +0.0030    +0.0214
                   -0.1389               -0.0059
  55    0.7002                +0.0161
                   -0.1228
  60    0.5774

The third differences forbid the use of Bessel’s formula to second differences only, but
the fourth differences can be thrown back on the second for the use of Everett’s formula.
Each is multiplied by 0.184 and subtracted from the second difference in the same line
to give the modified second difference. This is then multiplied by the coefficients in
Everett’s formula and combined with the linear interpolate:

  x     cot x°                                      Correct value
  41    1.15344 - 0.00205 - 0.00096 = 1.1504        1.1504
  42    1.11508 - 0.00273 - 0.00168 = 1.1107        1.1106
  43    1.07672 - 0.00239 - 0.00192 = 1.0724        1.0724
  44    1.03836 - 0.00137 - 0.00144 = 1.0356        1.0355
  45    1.00000                     = 1.0000        1.0000
  46    0.96782 - 0.00144 - 0.00068 = 0.9657        0.9657
  47    0.93564 - 0.00192 - 0.00120 = 0.9325        0.9325
  48    0.90346 - 0.00168 - 0.00137 = 0.9004        0.9004
  49    0.87128 - 0.00096 - 0.00103 = 0.8693        0.8693

There are only two differences of 1 in the last place in spite of the rather large higher
differences. The effect of $\delta^4$ is well over a unit in places, but is adequately taken into
account by the throw-back with hardly any additional work. It will be noticed that one
set of corrections is symmetrical about 45°. In long interpolations this halves the work
with Everett’s formula; there is a corresponding simplification with the Newton-Stirling
formula and also with the Newton-Bessel one.
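
As a check on the procedure, here is a minimal sketch of Everett’s formula to second differences with the throw-back $\delta^2 - 0.184\,\delta^4$. The function names `central` and `everett_throwback` are illustrative assumptions; the data are the four-figure cotangents tabulated above.

```python
from math import radians, tan

def central(values, i, order):
    """Central difference of even order (2 or 4) at index i."""
    if order == 2:
        return values[i + 1] - 2 * values[i] + values[i - 1]
    return (values[i + 2] - 4 * values[i + 1] + 6 * values[i]
            - 4 * values[i - 1] + values[i - 2])

def everett_throwback(values, i, theta, throw=0.184):
    """Everett to second differences, with δ² replaced by δ² - 0.184 δ⁴."""
    phi = 1 - theta
    coeff = lambda t: t * (t * t - 1) / 6          # coefficient of δ²
    m0 = central(values, i, 2) - throw * central(values, i, 4)
    m1 = central(values, i + 1, 2) - throw * central(values, i + 1, 4)
    return (phi * values[i] + theta * values[i + 1]
            + coeff(phi) * m0 + coeff(theta) * m1)

cot = [1.7321, 1.4281, 1.1918, 1.0000, 0.8391, 0.7002, 0.5774]  # 30° to 60° by 5°
cot_41 = everett_throwback(cot, 2, 0.2)            # 41° lies 0.2 of the way from 40° to 45°
```

The value `cot_41` should round to the 1.1504 found in the table; the fourth differences never enter except through the modified second differences.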

9.044. When a function is originally given at a number of irregularly and widely
spaced values, as often happens when the determinations are from experiment, and a 
detailed table is wanted, the usual procedure would be: first interpolate by divided differ¬ 
ences to equal intervals such that on an average about two values lie between con¬ 
secutive datum values; then interpolate by Bessel’s or Everett’s formula to intervals 
such that linear interpolation is possible. It is a matter of convenience whether we proceed 
by stages in this way or do the whole interpolation by divided differences at once. Detailed 
tables of the Bessel and Everett coefficients at intervals 0.001 of $\theta$ have been published by
A. J. Thompson,* E. Chappell,† Comrie‡ and L. J. Briggs and A. N. Lowan.‖ The above
specimen values would usually suffice for an interpolation, but if only a few values are 
wanted the use of these tables will give them with only one rounding-off error instead of two. 
On the other hand, this difficulty can be greatly reduced by carrying out the preliminary 
interpolations to an extra figure. It is a customary requirement of mathematical tables 
that the last figure given should be correct to half a unit, and in using them it is often well 
worth while to keep an extra figure to reduce the accumulation of rounding-off errors if 
the work has to be done in several stages. A lot could be said for tolerating errors up to 3 
in the last place of published tables; a five-figure table with such errors is more accurate 
than a four-figure one with errors up to 0.5 in the last figure, and involves no more trouble
in interpolation. Such a device is virtually used in the tables of Milne-Thomson and
Comrie, which are printed to four decimals, but an upper dot is added at the end if the
correction needed is between $+\frac16$ and $+\frac12$ in the last figure, and a lower dot if it is between
$-\frac16$ and $-\frac12$. Thus if 0.0008˙ is read as 0.00083, and 0.0008. as 0.00077 the tables can be
used as five-figure ones, and interpolation does not need the retention of more figures than 
are needed to prevent rounding-off errors in the last place with the usual four-figure ones. 

It is customary to round off to the nearest integer in the last place, not to the next 
integer below. This prevents all the rounding-off errors from having the same sign and 
accumulating in a sum. When the first figure neglected is 5, one usually takes the nearest 
even integer in the place kept. 

9.045. Interpolation when a derivative becomes infinite. It should be remembered
that the limitation of accuracy in interpolation formulae imposed by the higher
derivatives of the function is not trivial. This is obvious if the function is infinite at a
point of the range, since differences that involve the value at that point are infinite and
the whole method breaks down. But it is also serious if a derivative is infinite, as for, say,
$x^{1/2}$ at $x = 0$. We have the following difference table:

  x^(1/2)    Δ         Δ²        Δ³        Δ⁴
  0.000
             +1.000
  1.000                -0.586
             +0.414              +0.490
  1.414                -0.096              -0.444
             +0.318              +0.046
  1.732                -0.050
             +0.268
  2.000


* Tracts for Computers, 5, 1921; second edition 1944.
† Published privately, 1929.
‡ ‘Interpolation and Allied Tables’, from Nautical Almanac, 1937.
‖ Tables of Lagrangian interpolation coefficients, W.P.A., New York, 1944.
We try to interpolate a value for $x = 0.5$ by Gregory’s formula. We get

$$0.5000 + \frac{\frac12(-\frac12)}{2!}(-0.586) + \frac{\frac12(-\frac12)(-\frac32)}{3!}(0.490) + \frac{\frac12(-\frac12)(-\frac32)(-\frac52)}{4!}(-0.444) + \dots$$

$$= 0.5000 + 0.0732 + 0.0306 + 0.0172 + \dots = 0.6210 + \dots.$$

The correct value is 0.7071, but we have not achieved a tolerable approach to it even with
fourth differences. It is necessary to the success of the interpolation process that derivatives
up to the order of the last difference retained shall exist throughout the range used
in forming that difference.
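
The slow approach to the true value can be seen by printing the partial sums of the Gregory series. This is a sketch only; the helper names `forward_differences` and `gregory_partial_sums` are illustrative assumptions, and the data are the tabulated square roots at $x = 0, 1, 2, 3, 4$.

```python
from math import sqrt

def forward_differences(values):
    """Return the leading differences [f0, Δf0, Δ²f0, ...]."""
    leading, row = [], list(values)
    while row:
        leading.append(row[0])
        row = [b - a for a, b in zip(row, row[1:])]
    return leading

def gregory_partial_sums(values, theta):
    """Partial sums of the Gregory series at x0 + theta*h."""
    sums, total, coeff = [], 0.0, 1.0
    for r, diff in enumerate(forward_differences(values)):
        total += coeff * diff
        sums.append(total)
        coeff *= (theta - r) / (r + 1)     # next binomial coefficient
    return sums

table = [0.000, 1.000, 1.414, 1.732, 2.000]   # x^(1/2) at x = 0, 1, 2, 3, 4
sums = gregory_partial_sums(table, 0.5)
shortfall = sqrt(0.5) - sums[-1]              # still large after fourth differences
```

The last partial sum stays near 0.621, so `shortfall` is about 0.086, in line with the failure described above.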

This difficulty can often be circumvented by a change of $f(x)$. Thus though $\cot x°$ is
infinite at $x = 0$, $x\cot x°$ has a definite limit there equal to 57.30, and its derivatives
are finite. If $\cot x°$ is given for a set of values of $x$ we can therefore interpolate $x\cot x°$
and then divide by $x$. Again, $\cosh^{-1}(1+x)$ behaves like $(2x)^{1/2}$ near $x = 0$; but we can
interpolate its square and take the root afterwards.


9.05. Inverse interpolation: solution of equations. This process is useful when
we have values of a function for equal intervals of the argument and want to know for
what value the function takes a given value.* Consider the equation

$$f(x) = x^3 - 3x - 7 = 0.$$

By inspection there is a root between +2 and +3. We can begin by calculating values for
$x = 2, 2.1, \dots, 3.0$ or by calculating for $2, 2.2, \dots, 3.0$ and then interpolating to the
mid-points, where the third difference is irrelevant.

We find

  x      f(x)       Δf(x)      Δ²f(x)     Δ³f(x)
  2.2    -2.952
                    +1.219
  2.3    -1.733                +0.138
                    +1.357                +0.006
  2.4    -0.376                +0.144
                    +1.501                +0.006
  2.5    +1.125                +0.150
                    +1.651                +0.006
  2.6    +2.776                +0.156
                    +1.807
  2.7    +4.583

The constancy of $\Delta^3 f(x)$ checks the arithmetic. The root is clearly about 2.425. We
interpolate by Everett’s formula for the highest accuracy and get

  x       f(x)                                        Δf(x)       Δ²f(x)
  2.40    -0.37600
                                                      +0.14352
  2.41    -0.2259 - 0.00410 - 0.00248 = -0.23248                  +0.00145
                                                      +0.14497
  2.42    -0.0758 - 0.00691 - 0.00480 = -0.08751                  +0.00145
                                                      +0.14642
  2.43    +0.0743 - 0.00857 - 0.00682 = +0.05891                  +0.00145
                                                      +0.14787
  2.44    +0.2244 - 0.00922 - 0.00840 = +0.20678                  +0.00147
                                                      +0.14934
  2.45    +0.3745 - 0.00900 - 0.00938 = +0.35612

The third differences are now irrelevant, and the root is near 2.426. We interpolate by
Bessel’s formula as follows:

  x        f(x)
  2.425    -0.01430 - 0.00018 = -0.01448
  2.426    +0.00034 - 0.00017 = +0.00017

* Comrie, Inverse Interpolation.
The contributions from the second difference are nearly constant and we have

$$x = 2.425 + 0.001 \times \frac{0.01448}{0.01465} = 2.4259884,$$

with a possible error in the last figure.
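
The final step, linear inverse interpolation between two close tabular values, can be sketched as follows. This sketch evaluates $f$ directly at 2.425 and 2.426 rather than interpolating to those points as the text does, and the function name `inverse_linear` is an illustrative assumption.

```python
def f(x):
    """The cubic of the worked example."""
    return x ** 3 - 3 * x - 7

def inverse_linear(x_lo, x_hi):
    """Argument at which the chord through (x_lo, f) and (x_hi, f) vanishes."""
    f_lo, f_hi = f(x_lo), f(x_hi)
    return x_lo + (x_hi - x_lo) * (-f_lo) / (f_hi - f_lo)

root = inverse_linear(2.425, 2.426)
```

With exact function values the result agrees with the text’s 2.4259884 to within the stated uncertainty in the last figure.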

Alternatively, we could interpolate between 2.4 and 2.5 by using a mean second
difference. The neglect of third differences might then make an error of 0.0001 in $f(x)$
and an error of 0.000007 in $x$ might be expected. For most purposes such accuracy would
be ample.

These methods are not restricted to algebraic equations. They would, for instance,
enable us to construct a table of $\sin^{-1}x$ given a table of $\sin x$.

The following method is also useful for algebraic equations. In the above equation we put

$$x = 2 + x_1;$$

then $-f(x) = 7 + 3(2+x_1) - (8 + 12x_1 + 6x_1^2 + x_1^3) = 5 - 9x_1 - 6x_1^2 - x_1^3 = 0$.

Now put $x_1 = 0.4 + x_2$, and the equation becomes

$$+0.376 - 14.28x_2 - 7.2x_2^2 - x_2^3 = 0.$$

Write this as $14.28x_2 = +0.376 - 7.2x_2^2 - x_2^3$

and put $x_2 = +0.026$ on the right, which is now +0.3711152, giving a further approximation
$x_2 = +0.02598846$. Try next $x_2 = +0.0259$; the right side becomes +0.3711528,
$x_2 = +0.02599109$. Linear interpolation gives for

$$x_2 = 0.02599, \quad x_2' = 0.02598872,$$

and direct calculation $x_2' = 0.02598872$. This justifies linear interpolation; and finally
interpolating to $x_2 = 0.025989$ we have $x_2' = 0.02598875$. Then

$$x = 2.42598875.$$
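
The same rearranged equation can also be solved by simple repeated substitution instead of the interpolation between trial values used above. This is a swapped-in variant, offered only as a sketch; the function name `refine` and the number of steps are illustrative assumptions.

```python
def refine(x2, steps=6):
    """Repeatedly substitute into 14.28 x2 = 0.376 - 7.2 x2**2 - x2**3."""
    for _ in range(steps):
        x2 = (0.376 - 7.2 * x2 ** 2 - x2 ** 3) / 14.28
    return x2

x2 = refine(0.026)     # start from the trial value used in the text
x = 2.4 + x2           # since x = 2 + x1 and x1 = 0.4 + x2
```

Because the coefficient of the first power dominates, each substitution reduces the error by a factor of roughly 40, so a handful of steps suffices.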

In principle this method® is the same as that usually known as Horner’s, but it is closer
to one given by Newton. Horner’s contribution seems to have been the introduction of
synthetic division, a useful device in its proper place, but experience of the method does
not encourage the belief that the easiest way to add 3 × 4 is to add 4 in three separate
operations. One great advantage of not multiplying the roots by 10 at each stage is that
the coefficient of the first power of the unknown then varies little in the later stages, and
it is easier to see what higher powers can be neglected consistently with the accuracy
required.*

9.06. Checking by differences. When a function is given at close intervals the higher
differences decrease rapidly. When the values have been found independently the formation
of differences therefore gives an easy check on the arithmetic. They will not in general
be zero, since the tabulated values will usually have errors up to 0.5 in the last place, with
either sign. Consequently the error of $\Delta f$ may reach a whole unit; while

$$\delta^2 f_0 = f_1 - 2f_0 + f_{-1}, \quad \delta^3 f_{1/2} = f_2 - 3f_1 + 3f_0 - f_{-1}, \quad \delta^4 f_0 = f_2 - 4f_1 + 6f_0 - 4f_{-1} + f_{-2}$$

* Cf. also Jeffreys, Math. Gaz. 27, 1943, 20; L. J. Mordell, Nature, 119, 1927, 42; Jeffreys,
Nature, 119, 1927, 565.
and may reach 2, 4, and 8 units respectively. Thus we take a set of values from
Bottomley’s table of natural logarithms:

  x       log x     Δf     Δ²f    Δ³f    Δ⁴f
  7.00    1.9459
                    +14
  7.01    1.9473           +1
                    +15           -2
  7.02    1.9488           -1            +3
                    +14           +1
  7.03    1.9502            0            -1
                    +14            0
  7.04    1.9516            0             0
                    +14            0
  7.05    1.9530            0            +1
                    +14           +1
  7.06    1.9544           +1            -3
                    +15           -2
  7.07    1.9559           -1            +3
                    +14           +1
  7.08    1.9573            0            -1
                    +14            0
  7.09    1.9587            0
                    +14
  7.10    1.9601


and the differences call for no comment. In interpolating such a table there is not only
no gain but an appreciable loss of accuracy if any attempt is made to keep differences
above the first. The rounding-off error may have the same sign at two consecutive entries,
and nothing can reduce it; but if it has opposite signs at two consecutive entries the errors
of the linear interpolates will be intermediate and on the whole smaller than those of the
tabulated values. To see the effect of this let us round off the values at intervals of 0.02
to the third figure and see what happens if we then try to keep a second difference and
halve the interval.

  x       log x    Δf    Δ²f
  7.00    1.946
                   +3
  7.02    1.949           0
                   +3
  7.04    1.952          -1
                   +2
  7.06    1.954          +1
                   +3
  7.08    1.957           0
                   +3
  7.10    1.960           0
                   +3
  7.12    1.963           0
                   +3
  7.14    1.966          -1
                   +2
  7.16    1.968          +1
                   +3
  7.18    1.971           0
                   +3
  7.20    1.974

The linear interpolates to four figures, the Bessel corrections, and the correct values are
as follows:

  x       log x (linear)   Bessel correction   Correct   Error
  7.01    1.9475            0                  1.9473    +2
  7.03    1.9505           +1                  1.9502    +3
  7.05    1.9530            0                  1.9530     0
  7.07    1.9555           -1                  1.9559    -4
  7.09    1.9585            0                  1.9587    -2
  7.11    1.9615            0                  1.9615     0
  7.13    1.9645           +1                  1.9643    +2
  7.15    1.9670            0                  1.9671    -1
  7.17    1.9695           -1                  1.9699    -4
  7.19    1.9725            0                  1.9727    -2

The errors given are those of the last figure in the linear interpolate and never reach 5.
The Bessel corrections are at most 0.6 in this place and trivial in any case; but in all four
cases where they are not zero they increase the magnitude of the error. This is a general
result and applies also to higher differences; the errors of interpolated values are on the
whole a little less than those of the tabular values, but the gain in accuracy on interpolation
is reduced by taking account of $n$th differences less than $2^{n-1}$ in the last place.*
This statement applies to interpolation between two neighbouring values. Many
books of tables print mean differences over a whole line of the table. If these are used the
tendency of adjacent errors to cancel disappears, and instead each interpolated value has
three nearly independent errors: rounding of the datum values; rounding of the printed
* R. A. Fisher and J. Wishart, Proc. Camb. Phil. Soc. 23, 1927, 912-21.
difference; and variation of the difference within a line of the table. Mean differences
save some trouble if last figure errors do not matter, but should not be used if they do.

Differencing will often show up a mistake in arithmetic immediately; thus suppose that
we had miscopied an entry in the table of log x as follows:

  f         Δf     Δ²f    Δ³f    Δ⁴f
  1.9459
            +14
  1.9473           +1
            +15           -2
  1.9488           -1            +2
            +14            0
  1.9502           -1            +3
            +13           +3
  1.9515           +2            -6
            +15           -3
  1.9530           -1
            +14
  1.9544

The large fourth difference, though just possible with only rounding-off errors present,
picks out the incorrect value of $f(x)$.
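
The check is easily mechanised. In the sketch below the entries are held as integers in units of the fourth decimal; the function name `differences` and the rule of flagging the entry opposite the largest fourth difference are illustrative assumptions rather than a procedure laid down in the text.

```python
def differences(values, order):
    """Differences of the given order of an equally spaced table."""
    row = list(values)
    for _ in range(order):
        row = [b - a for a, b in zip(row, row[1:])]
    return row

# Entries in units of the fourth decimal; 19515 is the miscopied value.
table = [19459, 19473, 19488, 19502, 19515, 19530, 19544]

fourth = differences(table, 4)
worst = max(range(len(fourth)), key=lambda k: abs(fourth[k]))
suspect = table[worst + 2]    # a fourth difference is centred two entries further on
```

Here `fourth` reproduces the last column of the table above and `suspect` is the miscopied entry.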


9.07. Differentiation. Newton’s and Gregory’s interpolation formulae can be differentiated
at once; we have

$$f'(x_1) = [x_1x_2] + (x_1-x_2)[x_1x_2x_3] + \dots, \tag{1}$$

$$hf'(x) = \Delta y_0 + \frac{2\theta-1}{2!}\Delta^2 y_0 + \frac{3\theta^2-6\theta+2}{3!}\Delta^3 y_0 + \dots. \tag{2}$$

The former is not often used because it yields derivatives only at the tabular values of
the argument. If they are wanted for intermediate values it is easier to interpolate the
function to equal intervals by divided differences and apply one of the rules for equal
intervals to the interpolate. The most useful form for equal intervals is got by differentiating
the Newton-Stirling formula, for in this all even differences are multiplied by $\theta^2$
and give terms in the derivatives that vanish for the tabular values of the argument.
We have

$$hf'(x_0) = \mu\delta f_0 - \tfrac16\mu\delta^3 f_0 + \dots + (-)^n\frac{(n!)^2}{(2n+1)!}\mu\delta^{2n+1}f_0 + \dots. \tag{3}$$

This is, of course, subject to the same limitation as applies to the central difference formulae
for interpolation, that it cannot be used near the beginning and end of the table.

Since the tabular errors in the function will usually be of the same order of magnitude
for all values of $x$, the accuracy of the right side of this formula is independent of the
interval, but the left contains a factor $h$. Consequently accuracy can be increased by using
a large interval, even though it may make higher differences important. Thus consider
$\sin x$ at $x = 1.0$, with $x$ in radians. The table of Milne-Thomson and Comrie is at intervals
of 0.001, and the first differences above and below 1.000 are +0.0006 and +0.0005. All
that we could say from this is that the derivative is likely to be between 0.5 and 0.6. If
we use intervals of 0.01 instead we have

  x        sin x     Δf     Δ²f    Δ³f    μδ        μδ³
  0.980    0.8305
                     +55
  0.990    0.8360            0
                     +55           -2
  1.000    0.8415           -2            0.0054    0.0000
                     +53           +2
  1.010    0.8468            0
                     +53
  1.020    0.8521

Then $0.01f' = 0.0054$, $f' = 0.54$. But $\mu\delta$ may be wrong by half a unit in the last place
and we can say only that $0.535 \le f' \le 0.545$.
Now try intervals of 0.1:

  x      sin x     Δf      Δ²f    Δ³f    μδ        μδ³
  0.8    0.7174
                   +659
  0.9    0.7833            -77
                   +582           -8
  1.0    0.8415            -85           +539.5    -6
                   +497           -4
  1.1    0.8912            -89
                   +408
  1.2    0.9320

$0.1f' = 0.05395 + 0.00010$, $f' = 0.5405$ with a possible error of 0.0005. The correct value
is 0.5403.
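
The two-term form of (3) used in this example can be sketched as follows. The function name `stirling_derivative` is an illustrative assumption; the mean third difference is written directly in terms of the tabular values, $\mu\delta^3 f_0 = \frac12(f_2 - 2f_1 + 2f_{-1} - f_{-2})$.

```python
from math import cos

def stirling_derivative(values, i, h):
    """First two terms of (3): h f'(x_i) ≈ μδf - (1/6) μδ³f."""
    mu_d1 = (values[i + 1] - values[i - 1]) / 2
    mu_d3 = (values[i + 2] - 2 * values[i + 1]
             + 2 * values[i - 1] - values[i - 2]) / 2
    return (mu_d1 - mu_d3 / 6) / h

sines = [0.7174, 0.7833, 0.8415, 0.8912, 0.9320]   # sin x at x = 0.8, 0.9, ..., 1.2
slope = stirling_derivative(sines, 2, 0.1)         # derivative at x = 1.0
```

The result reproduces the 0.5405 found above and lies within the quoted 0.0005 of $\cos 1$.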

Second derivatives can be found by differentiating the Newton-Stirling formula twice
and then putting $\theta = 0$. We have

$$h^2f''(x_0) = \delta^2 f_0 - \tfrac1{12}\delta^4 f_0 + \dots + (-)^{n-1}\frac{2\{(n-1)!\}^2}{(2n)!}\delta^{2n}f_0 + \dots. \tag{4}$$

The remarks above on the need for a wide interval of course apply even more forcibly to
this formula.


9.08. Integration. The simplest integration formula and one of the most useful is
the Euler-Maclaurin formula. If $f(x)$ is differentiable we have by integration by parts

$$\int_0^1 f(x)\,dx = \Big[xf(x)\Big]_0^1 - \int_0^1 xf'(x)\,dx = \tfrac12f(0) + \tfrac12f(1) - \int_0^1 (x-\tfrac12)f'(x)\,dx. \tag{1}$$

The first two terms give the ‘trapezoidal’ rule for integration; the integral expresses a
correction to it. Now for $0 \le x \le 1$ and $r \ge 2$ define

$$\begin{aligned}
P_r(0) &= 0, \\
P_2'(x) &= x - \tfrac12, \\
P_3'(x) &= b_2 + P_2(x), \\
P_4'(x) &= b_3 + P_3(x), \\
P_r'(x) &= b_{r-1} + P_{r-1}(x),
\end{aligned} \tag{2}$$

and choose the $b_r$ so that $P_r(1)$ also equals 0 for all $r \ge 2$.

Then

$$P_r(x) = \int_0^x \{b_{r-1} + P_{r-1}(\xi)\}\,d\xi, \tag{3}$$

$$b_r = -\int_0^1 P_r(x)\,dx. \tag{4}$$

Then by successive integrations by parts

$$\begin{aligned}
\int_0^1 P_2'(x)f'(x)\,dx &= \Big[P_2(x)f'(x)\Big]_0^1 - \int_0^1 P_2(x)f''(x)\,dx \\
&= -\int_0^1 \{P_3'(x) - b_2\}f''(x)\,dx \\
&= b_2\{f'(1) - f'(0)\} - \Big[P_3(x)f''(x)\Big]_0^1 + \int_0^1 P_3(x)f'''(x)\,dx \\
&= b_2\{f'(1) - f'(0)\} - b_3\{f''(1) - f''(0)\} + \dots \\
&\quad + (-)^r b_r\{f^{(r-1)}(1) - f^{(r-1)}(0)\} - (-)^{r+1}\int_0^1 P_{r+1}(x)f^{(r+1)}(x)\,dx,
\end{aligned} \tag{5}$$

since all $P_r$ vanish at both limits.
Now consider the series

$$\frac{a}{e^a-1} + \tfrac12a = \sum_{r=0}^{\infty} b_ra^r, \tag{6}$$

$$a\,\frac{e^{at}-1}{e^a-1} = \sum_{r=0}^{\infty} P_r(t)a^r. \tag{7}$$

We shall show that the $b_r$ and $P_r(t)$ defined in this way are identical with those given by
the above definitions (2). Differentiate (7) with regard to $t$; then

$$\frac{a^2e^{at}}{e^a-1} = \sum_{r=0}^{\infty} P_r'(t)a^r, \tag{8}$$

and also

$$\frac{a^2e^{at}}{e^a-1} = \sum_{r=0}^{\infty} P_r(t)a^{r+1} + \frac{a^2}{e^a-1} = \sum_{r=1}^{\infty} P_{r-1}(t)a^r + \sum_{r=1}^{\infty} b_{r-1}a^r - \tfrac12a^2. \tag{9}$$

Equating coefficients of $a^r$ we have for $r > 2$

$$P_r'(t) = P_{r-1}(t) + b_{r-1}. \tag{10}$$

For $r = 0, 1, 2$ we expand and get

$$a\,\frac{e^{at}-1}{e^a-1} = at + \tfrac12a^2(t^2-t) + \dots = P_0(t) + aP_1(t) + a^2P_2(t) + \dots,$$

whence $P_0(t) = 0$, $P_1(t) = t$, $P_2(t) = \tfrac12(t^2-t)$,

and

$$P_2'(t) = t - \tfrac12.$$

Also if $t = 0$, the function on the left of (7) vanishes; hence all $P_r(0) = 0$. If $t = 1$ the function
reduces to $a$; hence $P_r(1) = 0$ except for $P_1(1)$, which is 1. This proves that for $r \ge 2$
the functions defined by (2) and (6) and (7) are identical.

In (6) change $a$ to $-a$; we have

$$\frac{-a}{e^{-a}-1} - \tfrac12a = \frac{ae^a}{e^a-1} - \tfrac12a = a + \frac{a}{e^a-1} - \tfrac12a = \frac{a}{e^a-1} + \tfrac12a.$$

Hence (6) is an even function of $a$, and all $b_r$ with $r$ odd are zero. Then (5) simplifies to

$$-\int_0^1 (x-\tfrac12)f'(x)\,dx = -b_2\{f'(1) - f'(0)\} - b_4\{f'''(1) - f'''(0)\} - \dots. \tag{11}$$

Integrating the remainder term by parts we have 

in which the integrated part vanishes. 

If we now apply this result to the intervals 0 to 1, 1 to 2, ..., $n-1$ to $n$ and add, we have, from
(1) and (11), the Euler-Maclaurin formula (the method is essentially due to W. Wirtinger*)

$$\begin{aligned}
\int_0^n f(x)\,dx ={}& \tfrac12f(0)+f(1)+f(2)+\dots+f(n-1)+\tfrac12f(n)\\
&-b_2\{f'(n)-f'(0)\}-b_4\{f'''(n)-f'''(0)\}-\dots-b_{2r}\{f^{(2r-1)}(n)-f^{(2r-1)}(0)\}\\
&-\sum_{m=0}^{n-1}\int_m^{m+1}P_{2r+1}(x-m)\,f^{(2r+1)}(x)\,dx.
\end{aligned} \tag{12}$$

* Acta Math. 26, 1902, 266-60.

In its usual form this formula is stated in terms of the Bernoulli numbers* and
polynomials, $B_r$, $\phi_r(x)$, defined by

$$b_r = B_r/r!,\qquad P_r(x) = \phi_r(x)/r!. \tag{13}$$

The easiest way of calculating them is by successive applications of the relations (2).
The introduction of the factorials reduces the accumulation of large denominators in
the successive integrations, but slightly complicates the proof of the above theorem.

$$B_0 = 1,\quad B_2 = \tfrac16,\quad B_4 = -\tfrac1{30},\quad B_6 = +\tfrac1{42},\quad B_8 = -\tfrac1{30},\quad B_{10} = +\tfrac5{66}, \tag{14}$$

$$\begin{aligned}
\phi_2(x) &= x^2-x, & \phi_3(x) &= x^3-\tfrac32x^2+\tfrac12x, \qquad \phi_4(x) = x^4-2x^3+x^2,\\
\phi_5(x) &= x^5-\tfrac52x^4+\tfrac53x^3-\tfrac16x, & \phi_6(x) &= x^6-3x^5+\tfrac52x^4-\tfrac12x^2,\\
\phi_7(x) &= x^7-\tfrac72x^6+\tfrac72x^5-\tfrac76x^3+\tfrac16x, & \phi_8(x) &= x^8-4x^7+\tfrac{14}3x^6-\tfrac73x^4+\tfrac23x^2,\\
\phi_9(x) &= x^9-\tfrac92x^8+6x^7-\tfrac{21}5x^5+2x^3-\tfrac3{10}x, & \phi_{10}(x) &= x^{10}-5x^9+\tfrac{15}2x^8-7x^6+5x^4-\tfrac32x^2.
\end{aligned} \tag{15}$$
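
The remark that the numbers are most easily found "by successive applications of the relations (2)" can be sketched in code. The following is only an illustration, not part of the original text: the function names are hypothetical, polynomials are held as lists of exact fractions in ascending powers, and the starting point $P_2 = \tfrac12(x^2-x)$ is taken from (10).

```python
from fractions import Fraction
from math import factorial

def integrate(poly):
    """Integrate a polynomial (ascending coefficients) from 0 to x."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]

def evaluate(poly, x):
    """Evaluate a polynomial given by ascending coefficients."""
    return sum(c * x**k for k, c in enumerate(poly))

def bernoulli_table(r_max):
    """Return {r: B_r} for 2 <= r <= r_max using relations (2)-(4) and (13)."""
    P = integrate([Fraction(-1, 2), Fraction(1)])   # P_2 = (x^2 - x)/2
    B = {}
    for r in range(2, r_max + 1):
        b = -evaluate(integrate(P), 1)              # b_r = -int_0^1 P_r dx, eq. (4)
        B[r] = b * factorial(r)                     # B_r = r! b_r, eq. (13)
        P = integrate([P[0] + b] + P[1:])           # P_{r+1} from P_{r+1}' = b_r + P_r
    return B

B = bernoulli_table(10)
print(B[2], B[4], B[6], B[8], B[10])
```

Exact rational arithmetic is used so that the vanishing of the odd $b_r$ appears exactly rather than as rounding noise.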

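Changing $x$ to $x_0+\theta h$, we have

$$\begin{aligned}
\int_{x_0}^{x_0+nh} f(x)\,dx ={}& h\big[\tfrac12f(x_0)+f(x_0+h)+\dots+f\{x_0+(n-1)h\}+\tfrac12f(x_0+nh)\big]\\
&-b_2h^2\{f'(x_0+nh)-f'(x_0)\}-\dots-b_{2r}h^{2r}\{f^{(2r-1)}(x_0+nh)-f^{(2r-1)}(x_0)\}\\
&-h^{2r+2}\sum_{m=0}^{n-1}\int_m^{m+1}P_{2r+1}(\theta-m)\,f^{(2r+1)}(x_0+\theta h)\,d\theta.
\end{aligned} \tag{16}$$

Now in (7) put $t = \tfrac12$. We have

$$a\,\frac{e^{a/2}-1}{e^a-1} = \sum_{r=0}^{\infty} P_r(\tfrac12)\,a^r.$$

Change $a$ to $-a$ and subtract; we have

$$a = 2\sum_{\text{odd } r} P_r(\tfrac12)\,a^r,$$

and therefore $P_1(\tfrac12) = \tfrac12$, $P_{2r+1}(\tfrac12) = 0$ $(2r+1 > 1)$.

We shall prove that $P_{2r+1}(t)$ has no other zeros than 0, $\tfrac12$, and 1 for $0 \le t \le 1$; and $P_{2r}(t)$
has none but 0 and 1.

Suppose that $P_{2r-1}(t)$ has simple zeros at $t = 0$, $\tfrac12$ and 1, and at no other value. Take it
positive for $0 < t < \tfrac12$. Then by (8)

$$P_{2r}'(t) = P_{2r-1}(t)\ \begin{cases} > 0 & (0 < t < \tfrac12),\\ < 0 & (\tfrac12 < t < 1),\end{cases}$$

and $P_{2r}(0) = P_{2r}(1) = 0$.

Hence $P_{2r}$ has one maximum at $t = \tfrac12$ and no other stationary value, and in $0 < t < 1$ it
has the sign of $P_{2r-1}(\epsilon)$, where by $\epsilon$ we mean some number between 0 and $\tfrac12$.

Next, $P_{2r+1}'(t) = P_{2r}(t)+b_{2r}$.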

* At least three different definitions of $B_r$ are current; this one is that used by Milne-Thomson
except for $B_1$, and is far the most convenient. His $B_r(x)$ is the present $\phi_r(x)+B_r$.

The formula was first given by Euler and rediscovered independently by Maclaurin a few years 
later. Continental authors usually refer to it as the Euler formula, but there are many Euler 
formulae and only one Euler-Maclaurin formula. 
But as $P_{2r}'$ has only one zero in $0 < t < 1$, $P_{2r}(t)+b_{2r}$ can vanish for at most two values of $t$,
and $P_{2r+1}(t)$ can have at most three zeros in $0 \le t \le 1$. Hence $t = 0$, $\tfrac12$, 1 are the only zeros
of $P_{2r+1}(t)$. Also $P_{2r}(\tfrac12)$ must be numerically greater than $-b_{2r}$, and therefore the zero
of $P_{2r+1}(t)$ at $t = \tfrac12$ is simple.

Also since $P_{2r}(t)$ does not change sign between 0 and 1, $P_{2r}(\epsilon)$ has the opposite sign
to $b_{2r}$, by (4). Since $P_{2r}(0) = 0$, $P_{2r+1}'(\epsilon)$ and therefore $P_{2r+1}(\epsilon)$ have the same sign as $b_{2r}$,
which is the opposite sign to $P_{2r}(\epsilon)$ and therefore the opposite sign to $P_{2r-1}(\epsilon)$. Hence
the $P_{2r+1}(\epsilon)$ alternate in sign with $r$. It follows that the $b_{2r}$ alternate in sign.

The sign of a typical remainder term in (12) is that of

$$-\int_m^{m+1}P_{2r+1}(x-m)\,\{f^{(2r+1)}(x)-f^{(2r+1)}(m+\tfrac12)\}\,dx,$$

and if $f^{(2r+1)}(x)$ is monotonic in the interval, we have, since $P_{2r+1}(x)$ has opposite signs for
$0 < x < \tfrac12$ and $\tfrac12 < x < 1$, that the sign is that of $-P_{2r+1}(\epsilon)\{f^{(2r+1)}(m)-f^{(2r+1)}(m+\tfrac12)\}$. In an
important class of cases all the odd derivatives have the same sign. Hence, since the $P_{2r+1}(\epsilon)$
alternate in sign, the errors due to stopping at a given value of $r$ alternate in sign. Hence
the true value of the integral always lies between the sums of $r$ and $r+1$ terms of the series.
The condition will be seen to be satisfied in the examples that follow.

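The expansion can be derived operationally as follows. From (1)

$$\int_{x_0+mh}^{x_0+(m+1)h} f(x)\,dx = h\big[\tfrac12f(x_0+mh)+\tfrac12f\{x_0+(m+1)h\}\big]-h^2\int_0^1(\theta-\tfrac12)\,f'(x_0+mh+\theta h)\,d\theta. \tag{17}$$

Put

$$\frac{d}{dx} = D. \tag{18}$$

Then the last term is

$$-h^2\int_0^1(\theta-\tfrac12)\,D\,e^{(m+\theta)hD}f(x_0)\,d\theta, \tag{19}$$

in which the operator $D$ is independent of $\theta$ and therefore commutes with all functions
of $\theta$. Hence we can integrate with regard to $\theta$ as if $D$ was a constant. Then we get on
integration by parts

$$e^{mhD}\Big[-\tfrac12h(e^{hD}+1)+\frac1D(e^{hD}-1)\Big]f(x_0) = e^{mhD}\Big[-h+\Big(\frac1D-\tfrac12h\Big)(e^{hD}-1)\Big]f(x_0). \tag{20}$$

Also

$$\sum_{m=0}^{n-1}e^{mhD} = \frac{e^{nhD}-1}{e^{hD}-1}, \tag{21}$$

and therefore the sum with regard to $m$ of the last terms in (17) is

$$\begin{aligned}
(e^{nhD}-1)\Big(-\frac{h}{e^{hD}-1}+\frac1D-\tfrac12h\Big)f(x_0)
&= -\frac{e^{nhD}-1}{D}\sum_{r=1}^{\infty}\frac{B_{2r}(hD)^{2r}}{(2r)!}\,f(x_0)\\
&= -\frac{h^2B_2}{2!}\{f'(x_0+nh)-f'(x_0)\}-\frac{h^4B_4}{4!}\{f'''(x_0+nh)-f'''(x_0)\}-\dots
\end{aligned} \tag{22}$$

We thus obtain the Euler-Maclaurin expansion again but without a form for the remainder
after a given number of terms.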

The expansion should not be interpreted as an infinite series. The theorem of 9·012
fixes an upper bound to the remainder term in interpolation formulae, and the integral
of f(x) is the sum of the integrals of the interpolation polynomial and this remainder term.
If the remainder is small throughout the range the integral of the interpolation polynomial
is within specifiable limits an approximation to that of the function. But the
derivatives of the polynomial vanish exactly above a certain finite order, and the expansions
are properly interpreted as the sum of a finite number of terms. The justification
of the operational method in this problem therefore has nothing to do with convergence
of series. It rests on the facts that (1) the operators are expansible in positive integral
powers of D and therefore the terms after a certain order vanish when the operand is a
polynomial, (2) since negative powers of D do not arise, the non-commutative property
of differentiation and definite integration does not matter, (3) the error is the integral of
the error of the interpolation polynomial, and is fixed for any finite order irrespective of
questions of convergence.

As a matter of fact what usually happens is that the terms of the expansion decrease 
rapidly at first but afterwards increase, on account of the tendency of higher derivatives 
to increase if the function is not a polynomial. The most accurate value of the integral is 
then got by taking the sum up to the smallest term. We shall return to this matter when 
we come to asymptotic expansions, of which this is an example. 


9·081. Consider the integral

$$\log 2 = \int_{10}^{20}\frac{dx}{x}.$$

We have

$$f(x) = \frac1x,\qquad f'(x) = -\frac1{x^2},\qquad f'''(x) = -\frac{2\cdot3}{x^4},\qquad f^{(5)}(x) = -\frac{5!}{x^6},$$

and

$$\log 2 = \frac1{2\cdot10}+\frac1{11}+\frac1{12}+\dots+\frac1{19}+\frac1{2\cdot20}
-\frac{B_2}{2}\Big(\frac1{10^2}-\frac1{20^2}\Big)-\frac{B_4}{4}\Big(\frac1{10^4}-\frac1{20^4}\Big)-\dots.$$

We arrange the calculation as follows:

    0.0500000000
    0.0909090909
    0.0833333333
    0.0769230769
    0.0714285714
    0.0666666667
    0.0625000000
    0.0588235294
    0.0555555556
    0.0526315789
    0.0250000000
    ------------
    0.6937714031

    -(1/12)(0.01 - 0.0025)              = -0.000625
    +(1/120)(0.0001 - 0.00000625)       = +0.0000008333 - 0.0000000521
    -(1/252)(0.000001)(1 - 1/64)        = -0.0000000039
    +(1/240)(0.00000001)(1 - 1/256)     = +0.0000000000
                                  Total = -0.0006242227
                         Hence log 2    = +0.6931471804

The correct result is 0.6931471805....
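
The arrangement above can be reproduced mechanically. The sketch below is an illustration only (the function name and the `terms` parameter are not from the text); it uses the fact, derived above, that for $f(x) = 1/x$ the correction of order $2k$ is $-(B_{2k}/2k)(10^{-2k}-20^{-2k})$.

```python
import math

# Bernoulli numbers from (14); only these four are used here.
BERNOULLI = {2: 1 / 6, 4: -1 / 30, 6: 1 / 42, 8: -1 / 30}

def log2_euler_maclaurin(terms=4):
    """Trapezoidal sum for int_10^20 dx/x plus `terms` Euler-Maclaurin corrections."""
    a, b = 10, 20
    total = 0.5 / a + sum(1.0 / k for k in range(a + 1, b)) + 0.5 / b
    for k in range(1, terms + 1):
        # -b_{2k} {f^(2k-1)(20) - f^(2k-1)(10)} with f = 1/x
        total -= BERNOULLI[2 * k] / (2 * k) * (a ** (-2 * k) - b ** (-2 * k))
    return total

print(log2_euler_maclaurin(0))   # trapezoidal rule alone
print(log2_euler_maclaurin(4))   # with the four corrections tabulated above
print(math.log(2))
```

Calling the function with `terms=0`, 1, 2, ... shows the successive sums lying alternately above and below the true value, as the sign argument of 9·08 predicts for this integrand.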


9·082. Consider next Euler’s constant $\gamma$ defined by

$$\gamma = \lim_{n\to\infty}\Big(1+\frac12+\frac13+\dots+\frac1n-\log n\Big).$$

We have

$$\begin{aligned}
\log n-\log 10 &= \int_{10}^{n}\frac{dx}{x}\\
&= \frac1{20}+\frac1{11}+\frac1{12}+\dots+\frac1{n-1}+\frac1{2n}
-\frac1{12}\Big(\frac1{10^2}-\frac1{n^2}\Big)+\frac1{120}\Big(\frac1{10^4}-\frac1{n^4}\Big)-\frac1{252}\Big(\frac1{10^6}-\frac1{n^6}\Big)+\dots
\end{aligned}$$

and

$$\lim_{n\to\infty}\Big(\frac1{11}+\frac1{12}+\dots+\frac1n-\log n+\log 10\Big) = -\frac1{20}+\frac1{1200}-\frac1{1200000}+\frac1{252000000} = -0.049167496.$$

$$1+\frac12+\dots+\frac1{10}-\log 10 = 0.626383161$$

by direct summation. Hence by addition

$$\gamma = 0.577215665.$$

This is correct to nine figures. It will be noticed that the extension from three- to
nine-figure accuracy requires the computation of only two extra terms.*
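
The same computation can be checked in a few lines. This is a sketch under the assumptions of the text above (direct summation to $n_0 = 10$, then the first few Euler-Maclaurin terms); the function name and the parameter `n0` are illustrative only.

```python
import math

def euler_gamma(n0=10):
    """Approximate Euler's constant by summing to n0 and adding the asymptotic tail."""
    head = sum(1.0 / k for k in range(1, n0 + 1)) - math.log(n0)
    # Tail: -1/(2 n0) + 1/(12 n0^2) - 1/(120 n0^4) + 1/(252 n0^6)
    tail = -1 / (2 * n0) + 1 / (12 * n0**2) - 1 / (120 * n0**4) + 1 / (252 * n0**6)
    return head + tail

print(euler_gamma())
```

With `n0 = 10` the tail terms are exactly the four numbers combined in the text to give -0.049167496.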

9·083. Gregory’s integration formula. The Euler-Maclaurin formula is the
simplest and most accurate formula of numerical integration, but requires that direct
calculation of the derivatives shall be possible. Higher derivatives may be mathematically
complicated and it may be easier to replace them by differences. This can be done either
by a formula due to Gregory or by a central difference formula.

We have seen that the correcting terms in the Euler-Maclaurin expansion can be
expressed as in 9·08 (22). We also have at the beginning and end of the range respectively

$$hD = \log(1+\Delta),\qquad hD = -\log(1-\nabla), \tag{1}$$

$$-\frac{h}{e^{hD}-1}+\frac1D-\tfrac12h = -\frac{h}{\Delta}+\frac{h}{\log(1+\Delta)}-\tfrac12h = -\frac{h}{\nabla}-\frac{h}{\log(1-\nabla)}+\tfrac12h. \tag{2}$$

Two procedures are possible. We can express $D$ in powers of $\Delta$ and $\nabla$ and substitute in
the Euler-Maclaurin formula; or we can expand the operators in (2) directly without
appealing to previous knowledge of the Bernoulli numbers. Both involve rather heavy
algebra. A direct attack (Gregory again!) seems easier than either method. Developing
the Gregory interpolation function in powers of $\theta$ we have

$$\begin{aligned}
\theta(\theta-1) &= \theta^2-\theta,\qquad \theta(\theta-1)(\theta-2) = \theta^3-3\theta^2+2\theta,\\
\theta(\theta-1)(\theta-2)(\theta-3) &= \theta^4-6\theta^3+11\theta^2-6\theta,\\
\theta(\theta-1)(\theta-2)(\theta-3)(\theta-4) &= \theta^5-10\theta^4+35\theta^3-50\theta^2+24\theta,\\
\theta(\theta-1)\dots(\theta-5) &= \theta^6-15\theta^5+85\theta^4-225\theta^3+274\theta^2-120\theta,\\
\theta(\theta-1)\dots(\theta-6) &= \theta^7-21\theta^6+175\theta^5-735\theta^4+1624\theta^3-1764\theta^2+720\theta,\\
\theta(\theta-1)\dots(\theta-7) &= \theta^8-28\theta^7+322\theta^6-1960\theta^5+6769\theta^4-13132\theta^3+13068\theta^2-5040\theta,
\end{aligned}$$

and the respective integrals from 0 to 1 are

$$-\tfrac16,\quad +\tfrac14,\quad -\tfrac{19}{30},\quad +\tfrac94,\quad -\tfrac{863}{84},\quad +\tfrac{1375}{24},\quad -\tfrac{33953}{90}.$$

Hence

$$\begin{aligned}
\frac1h\int_{x_0}^{x_0+h} f(x)\,dx ={}& f(x_0)+\tfrac12\Delta f(x_0)-\frac{1}{6\cdot2!}\Delta^2f(x_0)+\frac{1}{4\cdot3!}\Delta^3f(x_0)-\frac{19}{30\cdot4!}\Delta^4f(x_0)\\
&+\frac{9}{4\cdot5!}\Delta^5f(x_0)-\frac{863}{84\cdot6!}\Delta^6f(x_0)+\frac{1375}{24\cdot7!}\Delta^7f(x_0)-\frac{33953}{90\cdot8!}\Delta^8f(x_0)+\dots.
\end{aligned} \tag{3}$$


* These and other fundamental numerical constants were computed by J. C. Adams to 272 
figures (Collected Scientific Papers 1, 459—470). 
The first two terms are $\tfrac12\{f(x_0)+f(x_0+h)\}$. On adding for intervals up to $x_0+nh$ we
have therefore

$$\begin{aligned}
\frac1h\int_{x_0}^{x_0+nh} f(x)\,dx ={}& \tfrac12f(x_0)+f(x_0+h)+\dots+f\{x_0+(n-1)h\}+\tfrac12f(x_0+nh)\\
&-\frac{1}{6\cdot2!}\{\Delta f(x_0+nh)-\Delta f(x_0)\}+\frac{1}{4\cdot3!}\{\Delta^2f(x_0+nh)-\Delta^2f(x_0)\}+\dots.
\end{aligned} \tag{4}$$

But we could equally well work in powers of $\nabla$, and the expansion will be the same as in
terms of $\Delta$ except that the signs of all even powers will be reversed. The point again is
that 9·08 (22) is exact as applied to the interpolation polynomial. The relation between $\nabla$
and $\Delta$ given in (2) is also exact as applied to this polynomial. Hence we can replace the
terms in $\Delta^rf(x_0+nh)$ by the equivalent expression in $\nabla^rf(x_0+nh)$. The argument does not
assume that $f(x)$ is determinate beyond $x_0+nh$, merely that the interpolation polynomial
is, and this is true. Hence

$$\begin{aligned}
\frac1h\int_{x_0}^{x_0+nh} f(x)\,dx ={}& \tfrac12f(x_0)+f(x_0+h)+\dots+f\{x_0+(n-1)h\}+\tfrac12f(x_0+nh)\\
&-\tfrac{1}{12}\{\nabla f(x_0+nh)-\Delta f(x_0)\}-\tfrac{1}{24}\{\nabla^2f(x_0+nh)+\Delta^2f(x_0)\}\\
&-\tfrac{19}{720}\{\nabla^3f(x_0+nh)-\Delta^3f(x_0)\}-\tfrac{3}{160}\{\nabla^4f(x_0+nh)+\Delta^4f(x_0)\}\\
&-\tfrac{863}{60480}\{\nabla^5f(x_0+nh)-\Delta^5f(x_0)\}-\tfrac{275}{24192}\{\nabla^6f(x_0+nh)+\Delta^6f(x_0)\}\\
&-\tfrac{33953}{3628800}\{\nabla^7f(x_0+nh)-\Delta^7f(x_0)\}\dots.
\end{aligned} \tag{5}$$

This is the Gregory formula.
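
The "direct attack" lends itself to a short check: each coefficient in (3) is the integral from 0 to 1 of $\theta(\theta-1)\dots(\theta-k+1)$ divided by $k!$. The following sketch (function name hypothetical, exact fractions used throughout) builds the products and integrates them term by term.

```python
from fractions import Fraction
from math import factorial

def gregory_coefficients(order):
    """Coefficients of Δ, Δ², ..., Δ^order in (3): (1/k!) ∫_0^1 θ(θ-1)...(θ-k+1) dθ."""
    coeffs = []
    poly = [Fraction(1)]                    # running product, ascending powers of θ
    for k in range(1, order + 1):
        # Multiply the running product by (θ - (k - 1)).
        shifted = [Fraction(0)] + poly
        scaled = [-(k - 1) * c for c in poly] + [Fraction(0)]
        poly = [s + m for s, m in zip(shifted, scaled)]
        integral = sum(c / (j + 1) for j, c in enumerate(poly))
        coeffs.append(integral / factorial(k))
    return coeffs

for k, c in enumerate(gregory_coefficients(8), start=1):
    print(k, c)
```

The magnitudes from $k = 2$ onward are the coefficients that appear in (5).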

9·084. Central difference formula. Similarly we can integrate the Newton-Bessel
formula. Here all the terms involving odd differences give 0 on integration; the others give

$$\frac1h\int_{x_0}^{x_0+h} f(x)\,dx = \tfrac12f(x_0)+\tfrac12f(x_0+h)-\frac{1}{6\cdot2!}\mu\delta^2f_{1/2}+\frac{11}{30\cdot4!}\mu\delta^4f_{1/2}-\frac{191}{84\cdot6!}\mu\delta^6f_{1/2}+\frac{2497}{90\cdot8!}\mu\delta^8f_{1/2}+\dots. \tag{6}$$

But

$$2\mu\delta^{2r}f_{1/2} = \delta^{2r}f_1+\delta^{2r}f_0 = 2\mu\delta^{2r-1}f_1-2\mu\delta^{2r-1}f_0. \tag{7}$$

Hence

$$\begin{aligned}
\frac1h\int_{x_0}^{x_0+nh} f(x)\,dx ={}& \tfrac12f(x_0)+f(x_0+h)+\dots+f\{x_0+(n-1)h\}+\tfrac12f(x_0+nh)\\
&-\tfrac{1}{12}\{\mu\delta f(x_0+nh)-\mu\delta f(x_0)\}+\tfrac{11}{720}\{\mu\delta^3f(x_0+nh)-\mu\delta^3f(x_0)\}\\
&-\tfrac{191}{60480}\{\mu\delta^5f(x_0+nh)-\mu\delta^5f(x_0)\}+\tfrac{2497}{3628800}\{\mu\delta^7f(x_0+nh)-\mu\delta^7f(x_0)\}.
\end{aligned} \tag{8}$$

Fewer terms have to be calculated with this formula than with the Gregory one, and the
coefficients of the higher ones are smaller. On the other hand, the formation of the central
differences requires knowledge of the function outside the range of integration, whereas
the Gregory formula does not.





9·085. As an illustration take the integral

$$\tfrac16\pi = \int_0^{1/2}\frac{dx}{(1-x^2)^{1/2}}.$$

The square roots to 8 figures were taken from Barlow’s tables. In the table $f(x) = (1-x^2)^{-1/2}$;
differences of odd order are entered between the arguments to which they belong.

x     f(x)        Δ           Δ²          Δ³          Δ⁴          Δ⁵          Δ⁶          Δ⁷
0.00  1.00000000              0.00250470              0.00005682              0.00000372
                  0.00125235              0.00002841              0.00000186
0.05  1.00125235              0.00253311              0.00005868              0.00000376
                  0.00378546              0.00008709              0.00000562
0.10  1.00503781              0.00262020              0.00006430              0.00000469
                  0.00640566              0.00015139              0.00001031
0.15  1.01144347              0.00277159              0.00007461              0.00000593
                  0.00917725              0.00022600              0.00001624              0.00000260
0.20  1.02062072              0.00299759              0.00009085              0.00000853
                  0.01217484              0.00031685              0.00002477
0.25  1.03279556              0.00331444              0.00011562              0.00001246
                  0.01548928              0.00043247              0.00003723
0.30  1.04828484              0.00374691              0.00015285              0.00001952
                  0.01923619              0.00058532              0.00005675              0.00001201
0.35  1.06752103              0.00433223              0.00020960              0.00003153
                  0.02356842              0.00079492              0.00008828
0.40  1.09108945              0.00512715              0.00029788              0.00005372
                  0.02869557              0.00109280              0.00014200
0.45  1.11978502              0.00621995              0.00043988              0.00009597
                  0.03491552              0.00153268              0.00023797              0.00008660
0.50  1.15470054              0.00775263              0.00067785              0.00018257
                  0.04266815              0.00221053              0.00042054              0.00019081
0.55  1.19736869              0.00996316              0.00109839              0.00037338
                  0.05263131              0.00330892              0.00079392
0.60  1.25000000              0.01327208              0.00189231
                  0.06590339              0.00520123
0.65  1.31590339              0.01847331
                  0.08437670
0.70  1.40028009


We have

$$\tfrac12f(0)+f(0.05)+\dots+f(0.45)+\tfrac12f(0.50) = 10.47518052.$$

Using the central difference formula we see that all odd differences vanish at $x = 0$; and
the odd differences give at 0.50

$$2\mu\delta = 0.07758367,\quad 2\mu\delta^3 = 0.00374321,\quad 2\mu\delta^5 = 0.00065851,\quad 2\mu\delta^7 = 0.00027741.$$

Then the correction terms are

$$\begin{aligned}
&-\tfrac{1}{24}(2\mu\delta)+\tfrac{11}{1440}(2\mu\delta^3)-\tfrac{191}{120960}(2\mu\delta^5)+\tfrac{2497}{7257600}(2\mu\delta^7)\\
&\quad= -0.00323265+0.00002859-0.00000104+0.00000010\\
&\quad= -0.00320500.
\end{aligned}$$

Then

$$\tfrac16\pi = 0.05\,(10.47518052-0.00320500) = 0.5235987760,\qquad \pi = 3.1415926560.$$

The correct value is

$$\pi = 3.141592654.$$


Using the Gregory formula we find the following correction terms:

    Δ      -0.00280526
    Δ²     -0.00036471
    Δ³     -0.00002654
    Δ⁴     -0.00000679
    Δ⁵     -0.00000111
    Δ⁶     -0.00000043
    Δ⁷     -0.00000009
    Total  -0.00320493

and $\pi = 3.141592677$. This is inferior in accuracy to the last; the difference of 7 units in
the last figure of the sum might just be due to rounding-off errors but is not likely to be.
We see that the last pair of terms in the central difference formula differ by a factor of
about 10; so do the last pair from odd differences in the Gregory formula. We might then
expect that $\Delta^9$ would contribute about $-1$ in the last place. But then we should also
expect $\Delta^8$ to contribute about $-4$ in this place. It is probable therefore that the difference
arises because the terms in the Gregory formula decrease more slowly than those of the
central difference formula.

This example is rather favourable to the Gregory formula because of the infinity of
the function at $x = 1$. In finding centred differences we have to use values of the function
up to 0.70, and the higher centred differences are correspondingly larger than those
within the range of integration. If we had used instead the expression

$$\tfrac14\pi = \int_0^1\frac{dx}{1+x^2},$$

the contrast would have been more striking.*
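
The Gregory computation of this example can be sketched as follows. This is an illustration, not the authors' procedure in detail: it works in floating point rather than from an 8-figure table, so the last digits will not match the hand calculation exactly, and the function names are hypothetical. The coefficients are those of 9·083 (5).

```python
import math
from fractions import Fraction

# Coefficients of 9·083 (5) for difference orders 1..7.
GREGORY = [Fraction(1, 12), Fraction(1, 24), Fraction(19, 720), Fraction(3, 160),
           Fraction(863, 60480), Fraction(275, 24192), Fraction(33953, 3628800)]

def differences(values):
    """Successive difference rows: rows[k][i] is Δ^k applied to values[i]."""
    rows = [list(values)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return rows

def gregory(f, x0, h, n, order=7):
    """Trapezoidal sum corrected by end differences up to `order` (needs n >= order)."""
    y = [f(x0 + i * h) for i in range(n + 1)]
    rows = differences(y)
    total = 0.5 * y[0] + sum(y[1:-1]) + 0.5 * y[-1]
    for k in range(1, order + 1):
        forward = rows[k][0]           # Δ^k f(x0)
        backward = rows[k][-1]         # ∇^k f(x0 + n h)
        sign = -1 if k % 2 else 1      # odd k: ∇ - Δ; even k: ∇ + Δ
        total -= float(GREGORY[k - 1]) * (backward + sign * forward)
    return h * total

f = lambda x: 1 / math.sqrt(1 - x * x)
print(6 * gregory(f, 0.0, 0.05, 10, order=0))   # trapezoidal rule only
print(6 * gregory(f, 0.0, 0.05, 10))            # with differences up to the seventh
```

Only function values inside the range of integration are used, which is the practical advantage of the Gregory formula noted in 9·084.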


9·09. Special rules of integration. Let $f(x)$ be given for $x = -h$, 0, and $h$; then by
Lagrange’s formula the interpolation quadratic is

$$g(x) = f(-h)\frac{x(x-h)}{(-h)(-2h)}+f(0)\frac{(x+h)(x-h)}{h\,(-h)}+f(h)\frac{(x+h)\,x}{2h\cdot h},$$

and

$$\int_{-h}^{h} g(x)\,dx = \tfrac13h\{f(-h)+4f(0)+f(h)\}.$$

This leads to Simpson’s rule.† Divide the range into a number of equal intervals; take the
sum of the end ordinates, add four times the sum of the ordinates at the middles of the intervals
and twice the sum of the ordinates at the junctions of the intervals, and multiply by a sixth of
the interval. It amounts to fitting a quadratic to the three values of $f(x)$ at the beginning,
middle, and end of each interval. No attempt is made to maintain smoothness at the
junctions between the quadratics.

The possible presence of a cubic term does not affect the rule. For the cubic could differ 
from the quadratic only by a function that vanishes at —h, 0 and h, and such a function 
must be of the form Ax(h 2 — x 2 ). But the integral of this from — h to h is 0. 

Next, suppose $f(x)$ is given for $x = -3h$, $-h$, $h$, $3h$, that is, for four equally spaced
values. These determine an interpolation cubic

$$\begin{aligned}
g(x) ={}& f(-3h)\frac{(x+h)(x-h)(x-3h)}{(-2h)(-4h)(-6h)}+f(-h)\frac{(x+3h)(x-h)(x-3h)}{2h\,(-2h)(-4h)}\\
&+f(h)\frac{(x+3h)(x+h)(x-3h)}{4h\cdot2h\cdot(-2h)}+f(3h)\frac{(x+3h)(x+h)(x-h)}{6h\cdot4h\cdot2h}\\
={}& -\frac{f(-3h)}{48}\Big(\frac{x^3}{h^3}-\frac{3x^2}{h^2}-\frac{x}{h}+3\Big)+\frac{f(-h)}{16}\Big(\frac{x^3}{h^3}-\frac{x^2}{h^2}-\frac{9x}{h}+9\Big)\\
&-\frac{f(h)}{16}\Big(\frac{x^3}{h^3}+\frac{x^2}{h^2}-\frac{9x}{h}-9\Big)+\frac{f(3h)}{48}\Big(\frac{x^3}{h^3}+\frac{3x^2}{h^2}-\frac{x}{h}-3\Big)
\end{aligned}$$


* This integral is worked out by Whittaker and Robinson, Calculus of Observations, pp. 147-9,
third differences being used in the central difference formula; there is an error of 2 in the seventh
decimal of $\pi$. Ten intervals were used as here and seven decimals in the integrand.

† Simpson’s publication was in 1743. It had been given earlier by Cavalieri (1639) and James
Gregory (1668).
and

$$\begin{aligned}
\int_{-3h}^{3h} g(x)\,dx ={}& -\frac{h}{48}f(-3h)(-2\cdot3^3+3\cdot6)+\frac{h}{16}f(-h)(-\tfrac23\cdot3^3+9\cdot6)\\
&-\frac{h}{16}f(h)(\tfrac23\cdot3^3-9\cdot6)+\frac{h}{48}f(3h)(2\cdot3^3-3\cdot6)\\
={}& \tfrac34h\{f(-3h)+3f(-h)+3f(h)+f(3h)\}.
\end{aligned}$$

Comparing with Simpson’s rule, if we call the length of the range $H$ in both cases, this
rule is

$$\int_{-\frac12H}^{\frac12H} g(x)\,dx = \tfrac18H\{f(-\tfrac12H)+3f(-\tfrac16H)+3f(\tfrac16H)+f(\tfrac12H)\},$$

and is called the three-eighths rule. Simpson’s rule for the same range is

$$\int_{-\frac12H}^{\frac12H} g(x)\,dx = \tfrac16H\{f(-\tfrac12H)+4f(0)+f(\tfrac12H)\}.$$

Note that if $f(x)$ is constant both rules give $Hf(x)$, as they should; this is a help towards
remembering the numerical coefficients. The three-eighths rule is due to Cotes. Both are
correct up to cubic terms. Let us see how they work for a fourth power. Take $H = 2$,
$f(x) = x^4$. The correct value is then

$$\tfrac25 = 0.4.$$

Simpson’s rule gives

$$\tfrac13(1+0+1) = 0.667.$$

The three-eighths rule gives

$$\tfrac14(1+\tfrac1{27}+\tfrac1{27}+1) = \tfrac{14}{27} = 0.519.$$

Thus both give results in excess of the true value, the three-eighths rule being the better.
But both will be inferior to the rules based on interpolation formulae, which take the
fourth and higher powers into account explicitly.
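
The fourth-power comparison is easily repeated. The sketch below is illustrative only (function names are not from the text); both rules are written for a single range of length $H = b-a$ as in the formulae above.

```python
def simpson(f, a, b):
    """Simpson's rule over one range of length H = b - a (three ordinates)."""
    H = b - a
    return H / 6 * (f(a) + 4 * f((a + b) / 2) + f(b))

def three_eighths(f, a, b):
    """Three-eighths rule over one range of length H = b - a (four ordinates)."""
    H = b - a
    return H / 8 * (f(a) + 3 * f(a + H / 3) + 3 * f(a + 2 * H / 3) + f(b))

quartic = lambda x: x ** 4
print(simpson(quartic, -1, 1))         # compare with 0.667 in the text
print(three_eighths(quartic, -1, 1))   # compare with 0.519 in the text
```

Replacing `quartic` by any cubic gives the exact integral from both functions, in line with the statement that both rules are correct up to cubic terms.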

As an example, we apply Simpson’s rule to the data of 9·085. They give

$$\tfrac16\pi = 0.523599265,$$

which is in error by 0.00000049. The $\delta^3$ term in the central difference formula contributed
0.05 × 0.00002859 = 0.00000143, the $\delta^5$ term 0.00000005. Hence the error given by
Simpson’s rule is about a third of the $\delta^3$ term in the central difference formula and about
ten times the $\delta^5$ term. A quartic term would be integrated exactly by the central difference
formula including $\delta^3$, but less accurately by Simpson’s rule.

The merits of Simpson’s and the three-eighths rules are that they are simple and
easily remembered; but they are less accurate than either difference formula up to $\delta^3$ if
the number of intervals used is the same. They do not need the formation of differences,
but differences should be formed in any case as a check on the calculation of $f(x)$, whether
they are used for calculation or not. If saving of labour is a consideration it is better to
save it by computing just enough values of $f(x)$ to give a good check by differences. The
use of the elementary rules should be restricted to cases where there really are only three
or four determined values of the function and we have to do our best with them.

A number of more complicated rules, the best-known of which is Weddle’s, take partial
account of higher differences; but it is not often convenient to divide the range up into
an integral multiple of the number of intervals that they require. For the same reason
they fail if we require the integral up to every tabular value of the argument. Some other
rules, one of them due to Gauss, attempt to economize labour by choosing the datum
points so that the integral based on them will be independent of powers higher than the 
number of datum points. But the trouble of interpolating the function to the suggested 
datum points is far greater than that of using a formula based on equal intervals, even if 
they have to be more numerous.^ 

9·091. L. F. Richardson’s method. One device for taking approximate account
of second differences is due to L. F. Richardson, and sometimes gives an answer of the
same order of accuracy as Simpson’s rule with very little trouble. It is based on the
principle that if a result is of the form $A+B/n^2+\dots$, we can get an approximate value of
$A$ by using two finite values of $n$ and extrapolating to $n = \infty$. For instance,

$$\pi = \lim_{n\to\infty} n\sin\frac{\pi}{n},$$

and $n\sin(\pi/n)$ is of the form $\pi+\dfrac{B}{n^2}+\dfrac{C}{n^4}+\dots$.

Take $n = 4$ and 6; then neglecting $C$,

$$\pi+\frac{B}{16} = \frac{4}{\sqrt2} = 2.828,\qquad \pi+\frac{B}{36} = 3.000.$$

Solving for $\pi$ we have $\pi = 3.138$,
which is a good return for this amount of work. Or take

$$\log 2 = \int_1^2\frac{dx}{x}.$$

With one interval this gives by the trapezoidal rule

$$A+B = \tfrac12(1+\tfrac12) = 0.75.$$

With another ordinate at $x = 1.5$, and therefore two intervals,

$$A+\tfrac14B = \tfrac14(1+\tfrac43+\tfrac12) = 0.7083;$$

whence $\log 2 = A = 0.6944$.

This method is useful for estimating limits of many different types, and is not confined
to integration. An extension to take account of terms in $n^{-4}$ is also possible, and would
give integrals equivalent to those containing third differences at the termini without the
need to remember the coefficients in the formulae. The method is called by Richardson
‘the deferred approach to the limit.’
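
Both worked cases amount to solving two linear equations for $A$. A minimal sketch, with a hypothetical helper name and assuming only the form $A+B/n^2$:

```python
import math

def richardson(value_coarse, n_coarse, value_fine, n_fine):
    """Eliminate B from value = A + B/n^2 using two values of n; returns A."""
    w_c, w_f = 1 / n_coarse**2, 1 / n_fine**2
    return (value_fine * w_c - value_coarse * w_f) / (w_c - w_f)

# n sin(pi/n) for n = 4 and n = 6 is 4/sqrt(2) and 3 exactly.
print(richardson(4 / math.sqrt(2), 4, 3.0, 6))

# Trapezoidal values of the integral of 1/x from 1 to 2 with one and two intervals.
one_interval = 0.5 * (1 + 0.5)
two_intervals = 0.25 * (1 + 2 * (1 / 1.5) + 0.5)
print(richardson(one_interval, 1, two_intervals, 2))
```

In the second case the extrapolated value coincides with Simpson's rule on the same three ordinates, which is one way of seeing why the device has comparable accuracy.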

9·092. Functions that behave like $x^{\pm1/2}$ at a terminus. All the usual formulae
for numerical integration fail when the integrand behaves like $x^{-1/2}$ at a terminus; yet the
integral converges. They are also unsatisfactory when it behaves like $x^{1/2}$. Thus

$$\int_0^2 x^{1/2}\,dx = \tfrac43\sqrt2 = 1.8856,$$

while Simpson’s rule applied to $x = 0, 1, 2$ gives 1.8047. Similarly,

$$\int_0^3 x^{1/2}\,dx = 2\sqrt3 = 3.4641,$$

and the three-eighths rule applied to 0, 1, 2, 3 gives 3.3655. These rules will therefore
underestimate the contributions from the intervals near the ends by 3 or 4 per cent. It
is easy, however, to obtain more accurate formulae where the integrand is known to behave
like $x^{-1/2}$ or $x^{1/2}$ at the end of the range, using values at $x = h$ and $2h$ or $3h$ as the data. The
Gregory formula can then be used for the remainder of the range. Writing suffixes to
indicate the arguments used, we find*


$$\int_0^{2h}(\alpha x^{-1/2}+\beta x^{1/2})\,dx = h(\tfrac83\sqrt2\,y_1-\tfrac43y_2) = h(3.7712y_1-1.3333y_2), \tag{1}$$

$$\int_0^{3h}(\alpha x^{-1/2}+\beta x^{1/2})\,dx = 2h\sqrt3\,y_1+0\,y_3 = h(3.4641y_1+0.0000y_3), \tag{2}$$

$$\int_0^{3h}(\alpha x^{-1/2}+\beta x^{1/2}+\gamma x^{3/2})\,dx = h(\tfrac{14}5\sqrt3\,y_1-\tfrac85\sqrt6\,y_2+\tfrac{12}5y_3) = h(4.8497y_1-3.9192y_2+2.4000y_3), \tag{3}$$

$$\int_0^{2h}(\alpha x^{1/2}+\beta x^{3/2})\,dx = h(\tfrac{16}{15}\sqrt2\,y_1+\tfrac4{15}y_2) = h(1.5085y_1+0.2667y_2), \tag{4}$$

$$\int_0^{3h}(\alpha x^{1/2}+\beta x^{3/2})\,dx = h(\tfrac65\sqrt3\,y_1+\tfrac45y_3) = h(2.0785y_1+0.8000y_3), \tag{5}$$

$$\int_0^{3h}(\alpha x^{1/2}+\beta x^{3/2}+\gamma x^{5/2})\,dx = h(\tfrac67\sqrt3\,y_1+\tfrac{12}{35}\sqrt6\,y_2+\tfrac{16}{35}y_3) = h(1.4846y_1+0.8398y_2+0.4571y_3). \tag{6}$$

As an example let us evaluate

$$\tfrac13\pi = \int_{1/2}^{1}\frac{dx}{\sqrt{(1-x^2)}},$$

using only two and three intervals. (1) gives with $h = \tfrac14$

    x      (1-x²)^(-1/2)
    1/2    1.15470       -1.3333 y_2 = -1.53956
    3/4    1.51186        3.7712 y_1 =  5.70153
                                        4.16197

Hence $\pi = 3\times\tfrac14\times4.16197 = 3.1215$.

(3) gives with $h = \tfrac16$

    x      (1-x²)^(-1/2)
    1/2    1.1547  x   2.4000  =  +2.7713
    2/3    1.3416  x  -3.9192  =  -5.2580
    5/6    1.8091  x   4.8497  =  +8.7736
                                   6.2869

Hence $\pi = 3\times\tfrac16\times6.2869 = 3.1434$.

Thus we have an error of only 0.06 per cent from the use of only three intervals. A much
closer estimate could be obtained by using the Gregory formula at intervals of 0.05 up
to $x = 0.85$ and adding the integral from 0.85 to 1.00 obtained by means of (3). This
method has been extensively used in calculations in seismology, where the integrand is
of the form $\int\cosh^{-1}(1+\xi)\,dx$ and $\xi = ax+O(x^2)$ for $x$ small, or $\int\xi^{-1/2}\,dx$, where $\xi$ has
the same property.

* An extension is due to Bickley (Adm. Res. Ctee., Paper 9211, 1946).
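
Formula (3) and the three-interval example can be checked numerically. In this sketch the function names are hypothetical; the ordinates $y_1$, $y_2$, $y_3$ are taken at distances $h$, $2h$, $3h$ from the singular terminus, which in the example is $x = 1$.

```python
import math

def endpoint_rule_3(y1, y2, y3, h):
    """Formula (3): integral over a range 3h of a x^(-1/2) + b x^(1/2) + c x^(3/2)."""
    return h * (14 / 5 * math.sqrt(3) * y1 - 8 / 5 * math.sqrt(6) * y2 + 12 / 5 * y3)

def integrand(x):
    return 1 / math.sqrt(1 - x * x)

h = 1 / 6
# Distances h, 2h, 3h are measured back from the terminus x = 1.
third_pi = endpoint_rule_3(integrand(1 - h), integrand(1 - 2 * h), integrand(1 - 3 * h), h)
print(3 * third_pi)   # compare with 3.1434 in the text
```

The rule is exact for any combination of $x^{-1/2}$, $x^{1/2}$ and $x^{3/2}$; the small residual error here comes from the higher terms in the expansion of the integrand about $x = 1$.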

9*093. Graphical methods are best avoided entirely. A rough sketch is useful to exhibit 
the general appearance of a function, but in any attempt at accurate work a numerical 
method will always give a more accurate result than a graphical one and with less trouble. 
From observations of the disastrous results of graphical methods in seismology it has 
appeared that graphical methods are liable to be less accurate than numerical ones that 
use only first differences. This seems nearly incredible at first sight, but may be due to a 
defect in most commercial squared paper. The spaces between the edges of the lines are 
uniform, not those between the centres. Consequently there is a systematic difference 
between the small squares adjacent to the thick lines that separate large squares and 
those near the middle of large squares. 


9·10. Numerical solution of differential equations. The simplest method, in
principle, and one that has come increasingly into prominence recently, is the direct use
of Taylor’s series. For a second order equation,

$$y'' = f(x, y, y'),$$

given that $y = y_0$ and $y' = y_1$ when $x = 0$, we can calculate $y''$ for $x = 0$ directly from the
differential equation. Differentiating the equation, we have

$$y''' = \frac{\partial f}{\partial x}+\frac{\partial f}{\partial y}\,y'+\frac{\partial f}{\partial y'}\,y'',$$

in which we can substitute the value of $y''$ just found, and so determine $y'''$. Differentiating
again we determine $y^{(4)}$ and so on to any order desired. The results are substituted in
Taylor’s series for both $y$ and $y'$, as follows:

$$\begin{aligned}
y &= y_0+y_1x+\frac{y''(0)}{2!}x^2+\frac{y'''(0)}{3!}x^3+\dots,\\
y' &= y_1+y''(0)\,x+\frac{y'''(0)}{2!}x^2+\frac{y^{(4)}(0)}{3!}x^3+\dots.
\end{aligned}$$

These are used to calculate $y$ and $y'$ up to such a value of $x$ that the terms neglected do
not affect the last figure retained. Let this be $h$. Then for $x = h$ we have $y$ and $y'$; again
using the relations found by differentiation we determine $y''(h)$, $y'''(h)$, $y^{(4)}(h)$, ... and
form new Taylor series in $x-h$. These are used to find values up to $x = 2h$. An important
check is obtained by summing the odd and even powers in the series separately. If we
have them for $x-h = \xi$, their sum gives $y$ for $x = h+\xi$; but their difference gives $y$ for
$x = h-\xi$, which is among the values already calculated, and the two calculations for
$h-\xi$ should agree. If they do they check the whole of the formation of the derivatives
and the Taylor expansion about $h$. By repetition we can proceed to any desired value of $x$.

This process was given by J. R. Airey for the solution of Emden’s equation

$$y''+\frac2x\,y'+y^n = 0,$$

where $y = 1$, $y' = 0$ at $x = 0$, and the calculations were done by J. C. P. Miller and
D. H. Sadler.* It is quite straightforward, and on account of the fact that the numerical
coefficients in Taylor’s series are much smaller than in any of the finite difference formulae
it enables a given range of argument to be covered in a smaller number of stages. It can
be applied to a differential equation of any order; for a first-order equation, of course,
only a series for $y$ is wanted; for a third-order one, series for $y$, $y'$ and $y''$.

As a simple example, take the equation

$$\frac{dy}{dx} = -xy,$$

with $y = 1$ at $x = 0$, the solution of which is $\exp(-\tfrac12x^2)$. By successive differentiation
and substitution for $y'$ at each stage we get

$$\begin{aligned}
y'' &= (x^2-1)\,y,\\
y''' &= (3x-x^3)\,y,\\
y^{(4)} &= (3-6x^2+x^4)\,y,\\
y^{(5)} &= (-15x+10x^3-x^5)\,y,\\
y^{(6)} &= (-15+45x^2-15x^4+x^6)\,y.
\end{aligned}$$

The series expansion for $x$ small is

$$y = 1-\tfrac12x^2+\tfrac18x^4-\tfrac1{48}x^6+\dots.$$

When $x = 0.5$ the last term is 0.0003, and four terms will give four-figure accuracy in
this range. We find

    x      y
    0.1    0.9950
    0.2    0.9802
    0.3    0.9560
    0.4    0.9231
    0.5    0.8825

We now use our general expressions to work out the derivatives at $x = 0.5$. With their
factorial divisors they are:

$$\begin{aligned}
y' &= -0.4412, & y''/2! &= -0.3309, & y'''/3! &= +0.2022,\\
y^{(4)}/4! &= +0.0575, & y^{(5)}/5! &= -0.0462, & y^{(6)}/6! &= -0.0057.
\end{aligned}$$

Now we use $\xi = x-0.5$ and work out sums of even and odd powers, as follows:

    ξ     Even powers   Odd powers   x     Difference   x     Sum
    0.1   0.8792        -0.0439      0.4   0.9231       0.6   0.8353
    0.2   0.8694        -0.0866      0.3   0.9560       0.7   0.7828
    0.3   0.8532        -0.1270      0.2   0.9802       0.8   0.7262
    0.4   0.8311        -0.1640      0.1   0.9951       0.9   0.6671
    0.5   0.8033        -0.1968      0.0   1.0001       1.0   0.6065

The greatest difference from the values previously calculated is 0-0001 at x = 0-1, and 
this entry is the sum of six, all with rounding-off errors in the last place. An error of 
0-0003 is therefore possible. 

* British Association Mathematical Tables, vol. 2, 1932. 
We next take $\xi = x-1.0$ and work out derivatives at $x = 1.0$.

$$\begin{aligned}
y' &= -0.6065, & y''/2! &= 0, & y'''/3! &= +0.2022,\\
y^{(4)}/4! &= -0.0505, & y^{(5)}/5! &= -0.0303, & y^{(6)}/6! &= +0.0135.
\end{aligned}$$

In the next stage we get

    ξ     Even powers   Odd powers   x     Difference   x     Sum
    0.1   0.6065        -0.0604      0.9   0.6669       1.1   0.5461
    0.2   0.6064        -0.1197      0.8   0.7261       1.2   0.4867
    0.3   0.6061        -0.1766      0.7   0.7827       1.3   0.4295
    0.4   0.6053        -0.2299      0.6   0.8352       1.4   0.3754
    0.5   0.6035        -0.2788      0.5   0.8823       1.5   0.3247

The greatest discrepancy is 0.0002, at $x = 0.5$ and 0.9. So we may proceed. There is an
inevitable tendency for rounding-off errors to accumulate. Generally speaking if an entry
is the sum of $m$ rounded-off values an error of $\tfrac12m$ in the last figure is possible, but can
occur only if all have such signs as will produce errors in the same direction and all
approach the extreme possible. Ordinarily their distribution is nearly random and each
can be taken as $\pm\dfrac{1}{2\sqrt3}$ (standard error); the resultant of $m$ will then be $\pm\tfrac12\sqrt{(\tfrac13m)}$ in
accordance with the usual principles of the composition of random errors. Errors up to this
will then be usual, and if $m$ is large errors of $\sqrt{(\tfrac13m)}$ or a little more may occur in about
1 entry in 20. If we proceed by series up to sixth powers the basic values for $x = 0.5, 1.0, \dots$
are sums of seven terms, and if integration is carried out to 10 stages an error of 5 units
may easily accumulate. This is avoided in practice by carrying out computations to one
or even two places more than are needed in the final answer. Some computers go so far
as to insist that computations should be carried to such a stage that it can be decided
with certainty whether the rounded-off figures are 0.499 or 0.501, and round off three
figures at the end in every entry to avoid the risk that 1 entry in 500 may be rounded off
in the wrong direction. This is hardly worth the extra labour, but the policy of keeping
one extra figure is a good one.
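
The stepping procedure of this section, applied to the example $y' = -xy$, can be sketched as follows. This is an illustration in floating point rather than the four-figure hand arrangement; the function names and the stage length parameter are hypothetical. The even and odd powers are summed separately so that the same expansion gives both $y(x+\xi)$ and the check value $y(x-\xi)$.

```python
import math

def derivatives(x, y):
    """y', y'', ..., y^(6) for y' = -x y, from the expressions found by differentiation."""
    return [-x * y,
            (x**2 - 1) * y,
            (3 * x - x**3) * y,
            (3 - 6 * x**2 + x**4) * y,
            (-15 * x + 10 * x**3 - x**5) * y,
            (-15 + 45 * x**2 - 15 * x**4 + x**6) * y]

def taylor_step(x, y, xi):
    """Return (y(x + xi), y(x - xi)) from the Taylor series about x up to sixth powers."""
    even, odd = y, 0.0
    for k, d in enumerate(derivatives(x, y), start=1):
        term = d * xi**k / math.factorial(k)
        if k % 2:
            odd += term
        else:
            even += term
    return even + odd, even - odd

def integrate(x_end, stage=0.5):
    """Advance from y(0) = 1 in stages of the given length."""
    x, y = 0.0, 1.0
    while x < x_end - 1e-12:
        y, _ = taylor_step(x, y, stage)
        x += stage
    return y

print(integrate(0.5), integrate(1.0), integrate(1.5))
```

The second value returned by `taylor_step` plays the part of the "Difference" column in the tables above.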

9·11. The Adams-Bashforth method. This starts with the Gregory backwards
extrapolation formula

$$f(a+\theta h) = f(a)+\theta\nabla f(a)+\frac{\theta(\theta+1)}{2!}\nabla^2f(a)+\dots.$$

Expanding the terms and integrating we find

$$\frac1h\int_a^{a+h} f(x)\,dx = f(a)+(\tfrac12\nabla+\tfrac5{12}\nabla^2+\tfrac38\nabla^3+\tfrac{251}{720}\nabla^4+\tfrac{95}{288}\nabla^5+\dots)\,f(a).$$

Hence if

$$y' = f(x, y),$$

and we know $y$ and $f$ up to $x = a$, we can obtain a value for $y$ at $x = a+h$ from $f$ at $a$ and its
backward differences. For a given interval the terms decrease much more slowly than
with Taylor series. Thus, in our example above, the sixth derivative of $y$ at the origin is
$-15$. For an interval 0.5 the seventh backward difference of $xy$ will be of order $15\times0.5^6$
or about 0.24, and its coefficient is about $\tfrac13$. Hence if this method is used with the same
interval there will be serious errors in the second decimal. Shorter intervals therefore
become necessary. We try the same example with interval 0.1.
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With a finite difference method we always need a few values for a start. Here we know 
that at x = 0, y = 1 , y' = 0, y" = — 1 . We use the first three terms of the Taylor series 
to compute values at ±01, and form xy and its first two differences. Then for the next 
stage we have 

10Ay = - 0-0995 - f (0*0995) - 0 = - 0-1493, A y = - 0-0149. 

This gives y for x = 0-2. We form xy, and now have differences up to V 3 . In the next stage 

10 Ay = -0-1960-^(0-0965)+ ^(0-0030)+ |(0-0030) = -0-2420, Ay = -0-0242. 

Hence y is found for x = 0-3, and we proceed. It saves trouble both in writing and 
reading to keep only significant figures in the differences. 


(Backward differences of -xy, in units of the fourth decimal, each entered on the line of the
argument at which it is taken.)

    x        y        -xy        ∇      ∇²     ∇³     ∇⁴
  -0.1    0.9950    +0.0995
   0.0    1.0000     0.0000    -995
   0.1    0.9950    -0.0995    -995      0
   0.2    0.9801    -0.1960    -965    +30    +30
   0.3    0.9559    -0.2868    -908    +57    +27     -3
   0.4    0.9230    -0.3692    -824    +84    +27      0
   0.5    0.8824    -0.4412    -720   +104    +20     -7
   0.6    0.8352    -0.5011    -599   +121    +17     -3
   0.7    0.7826    -0.5478    -467   +132    +11     -6
   0.8    0.7260    -0.5808    -330   +137     +5     -6
   0.9    0.6668    -0.6001    -193   +137      0     -5
   1.0    0.6064    -0.6064

The contributions from the second and higher differences are at most in the third
decimal and can be safely worked out on the slide rule. As the total contribution
calculated from xy and its differences is divided by 10 there is virtually only one rounding-off
error at each stage. But such an error, once made, is carried on through the rest of the
calculation.
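
The Adams-Bashforth procedure for y' = -xy can be sketched in Python as follows. This is an
illustration under stated assumptions, not the computation of the text: the function names are
our own, full floating-point precision is kept instead of four decimals, and the coefficients of
the backward differences are generated from the recurrence
$\sum_{j=0}^{k} g_j/(k+1-j) = 1$, which follows from the generating function of the expansion
integrated above.

```python
from fractions import Fraction
import math


def adams_bashforth_coefficients(n):
    """Exact coefficients g_0 .. g_{n-1} of nabla^k in (1/h) * integral over one step."""
    g = []
    for k in range(n):
        g.append(Fraction(1) - sum(g[j] / (k + 1 - j) for j in range(k)))
    return g


def backward_differences(values, max_order):
    """[f_n, nabla f_n, nabla^2 f_n, ...] as far as the tabulated values allow."""
    diffs = [values[-1]]
    row = list(values)
    for _ in range(max_order):
        if len(row) < 2:
            break
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[-1])
    return diffs


def solve(h=0.1, steps=9, max_order=3):
    """Integrate y' = -x*y from x = h, keeping differences up to nabla^max_order."""
    coeffs = [float(c) for c in adams_bashforth_coefficients(max_order + 1)]
    xs = [-h, 0.0, h]
    ys = [1 - 0.5 * h * h, 1.0, 1 - 0.5 * h * h]  # first three Taylor terms
    fs = [-x * y for x, y in zip(xs, ys)]
    for n in range(steps):
        d = backward_differences(fs, max_order)
        # zip stops at the differences actually available at this stage
        y_next = ys[-1] + h * sum(c * v for c, v in zip(coeffs, d))
        x_next = (n + 2) * h
        xs.append(x_next)
        ys.append(y_next)
        fs.append(-x_next * y_next)
    return xs, ys
```

The first step, like that of the text, has only the second difference at its disposal; the third
difference enters from x = 0.2 onwards.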


9.12. Central-difference method. One difficulty of the Adams-Bashforth method
is to know how to start. In the above example values of y were computed for x = ±0.1
from the first three terms of Taylor’s series, and a second difference centred on x = 0 was
thus obtained. But no higher differences could be found for early values without using
more terms of Taylor’s series. In this example this would be practicable; but it often
happens that higher derivatives become excessively complicated in form, and their
calculation is a serious undertaking. This may make the whole method of Taylor’s series
impracticable; for instance, though it is extensively used in the calculation of
mathematical tables, astronomers prefer a quite different method for the computation of
cometary perturbations. Usually, however, a few terms of Taylor’s series are found and
used to compute four or five values of the solution, and the rest of the work is done by
finite differences.

The other difficulty is in the large coefficients of the higher differences. In the above
calculation the fourth differences gave effects at each step well within the rounding-off
error, but as they keep the same sign over several intervals they cannot be safely neglected.
It was absolutely necessary to keep third differences. Still higher differences would be
needed if greater accuracy was being attempted. But we know that the coefficients of
higher differences are much less with central-difference formulae than with the Gregory
formulae, and it is possible to modify the calculation so as to make use of this fact. We
have simply calculated, for instance, y for x = 0.3 by using y at x = 0.2 and the differences
of -xy running backwards from 0.2; -xy for x = 0.3 is then calculated from the
corresponding value of y. We have made no direct use of the fact that the increment of y from
x = 0.2 to x = 0.3 must be the integral of -xy over the range right up to 0.3. If some
higher difference that we have neglected actually made an error in the estimated y, we
should be able to check it by computing an integral from our estimates of -xy; the two
should agree, but they will not if some high-order differences have been illegitimately
neglected. When the table is complete such an integral can be found by a central-difference
formula; but it can also be found as we proceed. We have


$$\frac{1}{h}\int_a^{a+h} f(x)\,dx = \tfrac12 f(a) + \tfrac12 f(a+h) - \tfrac{1}{12}\,\mu\delta^2 f_{a+\frac12 h} + \tfrac{11}{720}\,\mu\delta^4 f_{a+\frac12 h} - \dots.$$

In this the terms in $\mu\delta^2$ and $\mu\delta^4$ will be a small correction. But we cannot use it directly
to compute the integral because we do not know $\mu\delta^2 f_{a+\frac12 h}$ until $f_{a+2h}$ is known, and we do
not know $\mu\delta^4 f_{a+\frac12 h}$ until $f_{a+3h}$ is known. We can, however, proceed by successive
approximation. These terms are small in any case, much smaller than those of the same orders
in the Adams-Bashforth method. We can therefore extrapolate the last difference retained
one stage, add the result to the previous difference, thus extrapolating that one, and so
work up to an extrapolated value of f(a + h). Also by extrapolating two stages we get
an extrapolated value of $\delta^2 f_{a+h}$. The fourth difference is small and has a very small
coefficient, so that it can usually be neglected; if it cannot we may have to extrapolate three
stages to determine it. In this way we get trial values of all the quantities needed to
compute the integral, and in forming it we introduce the further small factor h. Hence
the value found for y at x = a + h will be very nearly right. We now use it to calculate
f(a + h) and form corrected differences. With these we recalculate the integral and get a
much closer approximation. If we have chosen the interval suitably the change will not
be more than a few units in the last place.
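
The cycle of trial value, integration and correction can be sketched in Python for the example
y' = -xy. The sketch is ours and rests on assumptions: the names are hypothetical, the fourth
difference is neglected, the second difference at a + h is extrapolated by holding the third
difference constant, and a fixed number of sweeps replaces the judgment of the computer as to
when the figures have settled.

```python
import math


def f(x, y):
    return -x * y


def central_difference_solve(h=0.1, steps=9, sweeps=3):
    """Integrate y' = -x*y with the trapezoidal mean corrected by -(1/12) mu delta^2."""
    xs = [-h, 0.0, h]
    ys = [1 - 0.5 * h * h, 1.0, 1 - 0.5 * h * h]  # start from three Taylor terms
    fs = [f(x, y) for x, y in zip(xs, ys)]
    for n in range(steps):
        x_next = (n + 2) * h
        d2_prev = fs[-1] - 2 * fs[-2] + fs[-3]   # delta^2 f at a - h, already known
        f_next = 2 * fs[-1] - fs[-2] + d2_prev   # trial f(a + h): delta^2 carried forward
        for _ in range(sweeps):
            d2_now = f_next - 2 * fs[-1] + fs[-2]   # delta^2 f at a
            d2_next = 2 * d2_now - d2_prev          # extrapolated delta^2 f at a + h
            mean = 0.5 * (fs[-1] + f_next) - (d2_now + d2_next) / 24.0
            y_next = ys[-1] + h * mean
            f_next = f(x_next, y_next)              # corrected f(a + h)
        xs.append(x_next)
        ys.append(y_next)
        fs.append(f_next)
    return xs, ys
```

At the first step the second difference carried forward is zero, so that the trial value of
f at x = 0.2 is -0.1990, as in the worked example that follows.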

Returning to the same example, we write our first few values as follows:

    x        y        -xy         Δ       Δ²
  -0.1    0.9950    +0.0995
                                -995
   0.0    1.0000     0.0000                0
                                -995
  +0.1    0.9950    -0.0995


At this stage we have no means of predicting the variation of Δ² and so our first step is
to take $\delta^2 f_{0.1}$ as zero. We then have $\delta f_{0.15} = -0.0995$, $f_{0.2} = -0.0995 - 0.0995 = -0.1990$.
Then

$$10\,\Delta y = \tfrac12(-0.0995 - 0.1990) = -0.1492, \qquad \Delta y = -0.0149.$$

We therefore enter $y_{0.2}$ as 0.9801; this makes -xy = -0.1960 instead of -0.1990, and
$\delta f_{0.15} = -0.0965$, $\delta^2 f_{0.1} = +0.0030$, $\delta^3 f_{0.05} = +0.0030$. These suggest a revised value
$\delta^2 f_{0.2} = +0.0060$, and $\mu\delta^2 f_{0.15} = +0.0045$. The table, with the necessary corrections, is now

    x        y        -xy         Δ       Δ²       Δ³
  -0.1    0.9950    +0.0995
                                -995
   0.0    1.0000     0                     0
                                -995              +30
  +0.1    0.9950    -0.0995              +30
                                -965             (+30)
  +0.2              -0.1960             (+60)

where extrapolated values are shown in brackets. With the corrected values

$$10\,\Delta y = \tfrac12(-0.0995 - 0.1960) - \tfrac{1}{12}(0.0045) = -0.1481, \qquad \Delta y = -0.0148,$$

and hence $y_{0.2} = 0.9802$. We thus make only a last-figure change in y at the second
approximation, even though we had no information at all for our first extrapolation of Δ².
There is no further change in -xy to this accuracy. We now have



    x        y         -xy         Δ        Δ²      Δ³
   0.0    1.0000      0                      0
                                 -995               +30
   0.1    0.9950     -0.0995               +30
                                 -965
   0.2    0.9802     -0.1960              (+60)
                                (-905)
   0.3              (-0.2865)             (+90)

Then

$$10\,\Delta y = \tfrac12(-0.1960 - 0.2865) - \tfrac{1}{12}(0.0075) = -0.2418, \qquad \Delta y = -0.0242, \qquad y_{0.3} = 0.9560.$$

This gives -xy = -0.2868 and we correct the differences accordingly. 10Δy is changed to

$$\tfrac12(-0.1960 - 0.2868) - 0.0006 = -0.2420,$$

and we need no further change in y. So we proceed; the final table, after the readjustments
have been made, is


    x        y        -xy         Δ       Δ²      Δ³
  -0.1    0.9950    +0.0995
                                -995
   0.0    1.0000     0                     0
                                -995             +30
   0.1    0.9950    -0.0995              +30
                                -965             +27
   0.2    0.9802    -0.1960              +57
                                -908             +27
   0.3    0.9560    -0.2868              +84
                                -824             +20
   0.4    0.9231    -0.3692             +104
                                -720             +16
   0.5    0.8825    -0.4412             +120
                                -600             +13
   0.6    0.8353    -0.5012             +133
                                -467              +4
   0.7    0.7827    -0.5479             +137
                                -330               0
   0.8    0.7261    -0.5809             +137
                                -193              -7
   0.9    0.6669    -0.6002             +130
                                 -63
   1.0    0.6066    -0.6065


This method has several advantages over the Adams-Bashforth one. The third differences
do not appear at all except in so far as they are taken into account in the extrapolated Δ².
The coefficient of the second difference is only one-fifth as large. Consequently we can have
greater confidence that inaccuracy has not crept in through neglect of higher differences.
There is also a difference at the start. There is a difficulty in all methods about starting the
integration if derivatives are troublesome to calculate. In both methods we started
assuming only the first two derivatives at x = 0. In the Adams-Bashforth method this
gave us a second difference at 0 but at no earlier value; but a third backward difference
from this was needed to infer y at x = 0.2. This was not available and we had to proceed
to 0.2, effectively, with the same quadratic formula as was used up to 0.1. If then there
are terms in x³ and x⁴ that are appreciable at x = 0.2 but not at x = 0.1, the method will
give an error there. This could be corrected by working backwards to x = -0.2 as well,
but this sacrifices the direct progress, which is the outstanding good point of the method.
In the central-difference method, on the other hand, we get a first approximation at
x = 0.2, which is the same as the Adams-Bashforth value, but also have the means of
correcting it, and actually introduce a last-figure change in y.

Actually with the central-difference method we do not even need a second difference
at the start. Suppose that we simply start with the information that at x = 0, y = 1,
y' = 0. Then since y' = 0 we can take as our trial values at ±0.1 the same value of y as
at x = 0. Then the table at this stage reads


    x        y        -xy          Δ         Δ²
  -0.1    1.0000    +0.1000
                               -0.1000
   0.0    1.0000     0.0000                  0
                               -0.1000
   0.1    1.0000    -0.1000


Then using the formula up to x = 0.1 we have

$$10\,(y_{0.1} - y_{0.0}) = \tfrac12(0.0000 - 0.1000), \qquad y_{0.1} - y_{0.0} = -0.0050,$$

giving a corrected value $y_{0.1} = 0.9950$. Similarly, we get a value at x = -0.1, and
without ever differentiating the differential equation we have the datum values needed for
a start.
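
The same start can be followed in a few lines of Python. This is a sketch with names of our
own choosing; it repeats the trapezoidal step from the trial value y(0.1) = 1 until the value
settles, the correction in the second difference being left out as in the step above.

```python
def starting_value(h=0.1, y0=1.0, sweeps=4):
    """Successive values of y(h) for y' = -x*y, starting from the trial y(h) = y0."""
    f0 = 0.0      # -x*y at x = 0
    y1 = y0       # trial value, since y'(0) = 0
    history = [y1]
    for _ in range(sweeps):
        f1 = -h * y1                      # -x*y at x = h from the current y(h)
        y1 = y0 + h * 0.5 * (f0 + f1)     # trapezoidal integral over (0, h)
        history.append(y1)
    return history
```

The first sweep already gives 0.9950, and later sweeps change only figures beyond the fourth
decimal.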

The correct values, taken from the British Association Tables, are as follows. The
last-figure errors of the solutions by the various methods are given for comparison:

    x        y       Taylor     Adams-       Central
                     series    Bashforth   differences
   0.0    1.0000       0          0            0
   0.1    0.9950       0          0            0
   0.2    0.9802       0         -1            0
   0.3    0.9560       0         -1            0
   0.4    0.9231       0         -1            0
   0.5    0.8825       0         -1            0
   0.6    0.8353       0         -1            0
   0.7    0.7827      +1         -1            0
   0.8    0.7261      +1         -1            0
   0.9    0.6670      +1         -2           -1
   1.0    0.6065       0         -1            0

As far as this example goes there is little to choose between the Taylor series* and the
central-difference method. The error at x = 0.2 in the Adams-Bashforth method,
which could not be corrected in that method without either working backwards a
stage or finding another term of the series, is carried on throughout the work. An
india-rubber is indispensable for the central-difference method.


9.121. Equations of higher order. All these methods can be directly adapted to
equations of higher orders. If our equation is

$$y'' = Py' + Qy + R(x),$$

with, at x = 0, $y = y_0$, $y' = y_1$, we need only take z = y' as a new variable and write two
first-order equations

$$y' = z, \qquad z' = Pz + Qy + R,$$

which we can then proceed to solve as before, the initial values of y and z being given. We
naturally advance by alternate stages in the solution of the two equations, and y and z
will be found with comparable accuracy. Take, for instance, the equation satisfied by
the Airy integral

$$\frac{d^2y}{dx^2} = xy.$$

We treat this as two,

$$\frac{dy}{dx} = z, \qquad \frac{dz}{dx} = xy,$$

and investigate the solution that makes y = 1, z = 0 when x = 0. Without using Taylor’s
series we start with the values not bracketed in the following:


    x        z          xy         Δ      Δ²          y          z         Δ       Δ²
  -0.1  (+0.0050)    -0.1000                                  (+0.0050)
                                 +1000                                    (-50)
   0.0    0.0000      0.0000                        1.0000      0.0000            (+100)
                                 +1000                                    (+50)
   0.1  (+0.0050)    +0.1000                                  (+0.0050)


For the first step we have $\mu\delta^2(xy) = 0$, and

$$10\,(z_{0.0} - z_{-0.1}) = -\tfrac12(0.1000 + 0) = -0.0500, \qquad z_{-0.1} = +0.0050,$$
$$10\,(z_{0.1} - z_{0.0}) = \tfrac12(0 + 0.1000) = +0.0500, \qquad z_{0.1} = +0.0050.$$


* Which proceeded five steps at a time in comparison with the others’ one.

Proceeding to the right part of the table we have

$$10\,(y_{0.0} - y_{-0.1}) = \tfrac12(0.0050 + 0) - 0.0008 = 0.0017, \qquad y_{-0.1} = 0.9998,$$
$$10\,(y_{0.1} - y_{0.0}) = \tfrac12(0.0050 + 0) - 0.0008 = 0.0017, \qquad y_{0.1} = 1.0002.$$


Filling in these values we have

    x        z          xy          Δ         Δ²         y          z         Δ       Δ²
  -0.1    0.0050     -0.1000                           0.9998     0.0050
                                +0.1000                                      -50
   0.0    0.0000      0.0000                  0        1.0000     0.0000            +100
                                +0.1000                                      +50
   0.1    0.0050     +0.1000                 (0)       1.0002     0.0050           (+100)
                               (+0.1000)                                   (+150)
   0.2   (0.0200)   (+0.2000)                         (1.0014)   (0.0200)


Δ(xy) can be extrapolated and gives xy = +0.2000 at x = 0.2. Then

$$10\,(z_{0.2} - z_{0.1}) = \tfrac12(0.3000), \qquad z_{0.2} = 0.0050 + 0.0150 = 0.0200.$$

We enter this on the right and form differences. Extrapolating Δ²z we have

$$10\,(y_{0.2} - y_{0.1}) = \tfrac12(0.0250) - 0.0008 = 0.0117, \qquad \Delta y = 0.0012, \qquad y_{0.2} = 1.0014.$$

Multiplying this by 0.2 and returning to the left we have xy = 0.2003 instead of 0.2000,
so we revise the differences. The change is too small to make any appreciable change in z.
So we proceed, advancing a stage in each table alternately. The solution up to x = 2.0
is as follows:


(Differences are in units of the fourth decimal. Δ and Δ³ are entered on the line of the lower
of the two arguments between which they fall; extrapolated values are in brackets.)

    x        z         xy        Δ       Δ²      Δ³
  -0.1    0.0050    -0.1000    1000
   0.0    0.0000     0.0000    1000       0
   0.1    0.0050     0.1000    1003       3
   0.2    0.0200     0.2003    1011       8      10
   0.3    0.0451     0.3014    1029      18      15
   0.4    0.0804     0.4043    1062      33      18
   0.5    0.1261     0.5105    1113      51      24
   0.6    0.1827     0.6218    1188      75      27
   0.7    0.2508     0.7406    1290     102      34
   0.8    0.3312     0.8696    1426     136      41
   0.9    0.4252     1.0122    1603     177      48
   1.0    0.5345     1.1725    1828     225      58
   1.1    0.6607     1.3553    2111     283      67
   1.2    0.8065     1.5664    2461     350      81
   1.3    0.9751     1.8125    2892     431     100
   1.4    1.1704     2.1017    3423     531     116
   1.5    1.3972     2.4440    4070     647     146
   1.6    1.6614     2.8510    4863     793     170
   1.7    1.9701     3.3373    5826     963     212
   1.8    2.3321     3.9199    7001    1175     260
   1.9    2.7580     4.6200    8436    1435    (310)
   2.0    3.2609     5.4636           (1745)

    x        y          z        Δ       Δ²      Δ³
  -0.1    0.9998     0.0050     -50
   0.0    1.0000     0.0000      50     100
   0.1    1.0002     0.0050     150     100
   0.2    1.0014     0.0200     251     101
   0.3    1.0046     0.0451     353     102
   0.4    1.0108     0.0804     457     104
   0.5    1.0210     0.1261     566     109
   0.6    1.0364     0.1827     681     115
   0.7    1.0580     0.2508     804     123
   0.8    1.0870     0.3312     940     136      17
   0.9    1.1247     0.4252    1093     153      16
   1.0    1.1725     0.5345    1262     169      27
   1.1    1.2321     0.6607    1458     196      32
   1.2    1.3053     0.8065    1686     228      39
   1.3    1.3942     0.9751    1953     267      48
   1.4    1.5012     1.1704    2268     315      59
   1.5    1.6293     1.3972    2642     374      71
   1.6    1.7819     1.6614    3087     445      88
   1.7    1.9631     1.9701    3620     533     106
   1.8    2.1777     2.3321    4259     639     131
   1.9    2.4316     2.7580    5029     770    (160)
   2.0    2.7318     3.2609            (930)
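
The alternate advance in the two tables can be sketched in Python. The sketch is ours and
makes assumptions that the text does not: the starting values at x = ±0.1 are taken from the
leading terms of the series for y and z instead of by successive approximation, full
floating-point precision is kept, fourth differences are neglected, and the names are
hypothetical.

```python
def corrected_mean(f_prev, f_now, f_next, d2_prev):
    """Mean of f over (a, a+h): trapezoid less (1/12) mu delta^2 f.

    The second difference at a+h is extrapolated from those at a-h and a.
    """
    d2_now = f_next - 2.0 * f_now + f_prev
    d2_next = 2.0 * d2_now - d2_prev
    return 0.5 * (f_now + f_next) - (d2_now + d2_next) / 24.0


def solve_airy_system(h=0.1, steps=19, sweeps=4):
    """Integrate y' = z, z' = x*y with y(0) = 1, z(0) = 0, one stage at a time."""
    xs = [-h, 0.0, h]
    ys = [1.0 + x**3 / 6.0 for x in xs]   # assumed start: two terms of the series
    zs = [0.5 * x * x for x in xs]        # assumed start: leading term of z = y'
    gs = [x * y for x, y in zip(xs, ys)]  # g = x*y is the derivative of z
    for n in range(steps):
        x_next = (n + 2) * h
        d2g_prev = gs[-1] - 2.0 * gs[-2] + gs[-3]
        d2z_prev = zs[-1] - 2.0 * zs[-2] + zs[-3]
        g_next = 2.0 * gs[-1] - gs[-2] + d2g_prev   # trial x*y at the new point
        for _ in range(sweeps):
            # left-hand table: advance z from x*y
            z_next = zs[-1] + h * corrected_mean(gs[-2], gs[-1], g_next, d2g_prev)
            # right-hand table: advance y from z, then revise x*y
            y_next = ys[-1] + h * corrected_mean(zs[-2], zs[-1], z_next, d2z_prev)
            g_next = x_next * y_next
        xs.append(x_next)
        ys.append(y_next)
        zs.append(z_next)
        gs.append(g_next)
    return xs, ys, zs
```

Because no figures are rounded off, the values found in this way need not agree with the
table to the last figure.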


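9.13. For a second-order equation with no term in y' other methods are available. If

$$y'' = f(x, y), \qquad (1)$$

we have

$$\delta^2 y = (e^{Dh} - 2 + e^{-Dh})\,y = h^2D^2y + \{(2\sinh\tfrac12 hD)^2 - h^2D^2\}\,y. \qquad (2)$$

Also since

$$(2\sinh\tfrac12 hD)^2 = \delta^2, \qquad (3)$$

this is

$$\delta^2 y = h^2 f + h^2\left\{\frac{\delta^2}{(2\sinh^{-1}\tfrac12\delta)^2} - 1\right\} f. \qquad (4)$$

To expand the operator we have

$$\sinh^{-1} t = t - \tfrac16 t^3 + \tfrac{3}{40} t^5 - \dots, \qquad (5)$$

$$\frac{t^2}{(\sinh^{-1} t)^2} = 1 + \tfrac13 t^2 - \tfrac{1}{15} t^4 + \tfrac{31}{945} t^6 - \dots, \qquad (6)$$

whence

$$h^{-2}\delta^2 y = f + \tfrac{1}{12}\delta^2 f - \tfrac{1}{240}\delta^4 f + \tfrac{31}{60480}\delta^6 f - \dots. \qquad (7)$$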


The coefficient of the fourth difference in this formula is very small. Consequently it
practically permits a definitive calculation of $\delta^2 y$ given f and $\delta^2 f$. Then having, say,
y(a - h) and y(a), we have ∇y(a), and adding $\delta^2 y$ we have Δy(a) = y(a + h) - y(a). Then
y(a + h) is found by addition. The complication here is that f(a + h) will have to be
calculated from y(a + h), and until we know it we do not know $\delta^2 f(a)$. But this is easily
circumvented. If, for instance, h = 0.1, $\tfrac{1}{12}h^2\delta^2 f$ is about $\delta^2 f/1200$ and will not affect the
fourth place of decimals if $\delta^2 f$ is less than 0.0600. Further, if we can infer $\delta^2 f$ within 0.0600
from entries further up the table, we can use this approximate value in (7) and still get $\delta^2 y$
right within the rounding-off error. We then proceed by addition to infer y(a + h); from
this we calculate f(a + h), and form the differences of the latter accurately. If necessary
we can correct the approximate value taken for $\delta^2 f$ and repeat the calculation. In most
cases, if the interval is suitably chosen, it will rarely be found that further revision
changes $\delta^2 y$ by more than a unit in the last place.
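
The coefficients 1/12, -1/240 and 31/60480 can be checked by exact series arithmetic. The
Python sketch below is our own (the helper names are hypothetical); it forms the series of
$(\sinh^{-1}t)/t$ in powers of $t^2$, squares it, takes the reciprocal and puts $t = \tfrac12\delta$.

```python
from fractions import Fraction
from math import factorial


def arcsinh_over_t(n_terms):
    """Coefficients of (sinh^-1 t)/t in powers of u = t^2."""
    return [Fraction((-1)**n * factorial(2 * n),
                     4**n * factorial(n)**2 * (2 * n + 1))
            for n in range(n_terms)]


def multiply(a, b):
    """Product of two series of equal length, truncated to that length."""
    n = len(a)
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n)]


def reciprocal(a):
    """Series b with a*b = 1, assuming a[0] = 1."""
    b = [Fraction(1)]
    for k in range(1, len(a)):
        b.append(-sum(a[i] * b[k - i] for i in range(1, k + 1)))
    return b


def delta_squared_coefficients(n_terms=4):
    """Coefficients of 1, delta^2, delta^4, ... in delta^2 / (2 sinh^-1 (delta/2))^2."""
    s = arcsinh_over_t(n_terms)
    in_t = reciprocal(multiply(s, s))        # t^2 / (sinh^-1 t)^2 in powers of t^2
    return [c / 4**k for k, c in enumerate(in_t)]   # substitute t = delta/2
```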

It is slightly more convenient to extrapolate $\tfrac{1}{12}\delta^2 f$ than $\delta^2 f$, since it is going to be
multiplied by the small quantity h² in any case. But if we proceed directly to y in this way we
must pay special attention to the initial conditions. If in the above example we worked
to the fourth decimal, the values of y at x = 0 and 0.1 would only determine y' there
within 0.0005, and to this extent the solution would be uncertain by 0.0005 times the
solution that makes y = 0, y' = 1 at x = 0. In this case the solution in question reaches
3.6 at x = 2.0, so that for this reason alone an error of 18 in the fourth place might
accumulate. This is avoided by the method already used, since this attends to y' explicitly
and takes its value at x = 0 as a starting point. But if we use the present method we can
save the situation only by keeping an extra figure in the calculation.
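
The direct calculation of the second difference of y may be sketched in Python for the
equation y'' = xy. The sketch is an illustration with assumptions of our own: the names are
hypothetical, the term in the fourth difference is dropped, the trial value of the second
difference of f at each stage is the one found at the stage before, and the start is made from
the two-term series y = 1 + x³/6 at x = 0 and 0.1.

```python
def solve_second_order(h=0.1, steps=19, sweeps=3):
    """Integrate y'' = x*y, y(0) = 1, y'(0) = 0 from delta^2 y = h^2 (f + delta^2 f / 12)."""
    xs = [0.0, h]
    ys = [1.0, 1.0 + h**3 / 6.0]     # two terms of the Taylor series
    fs = [x * y for x, y in zip(xs, ys)]
    d2f = 0.0                        # trial delta^2 f for the first stage
    for n in range(steps):
        x_next = (n + 2) * h
        for _ in range(sweeps):
            d2y = h * h * (fs[-1] + d2f / 12.0)
            y_next = 2.0 * ys[-1] - ys[-2] + d2y      # add delta^2 y to the last step
            f_next = x_next * y_next
            d2f = f_next - 2.0 * fs[-1] + fs[-2]      # corrected delta^2 f
        xs.append(x_next)
        ys.append(y_next)
        fs.append(f_next)
    return xs, ys
```

Since y' never appears, any error in the two starting values of y is carried forward, which
is the reason for the extra figure recommended above.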

To treat the example of 9.121 in this way we notice that the first two terms of the
Taylor series are

$$y = 1 + \tfrac16 x^3 + \dots.$$

We write down a few values of y from this formula:

    x        y          Δ       Δ²         xy          Δ        Δ²    (1/12)δ²
  -0.2    0.99866                       -0.19973
                      +117                          +9975
  -0.1    0.99983             -100      -0.09998              +23        2
                       +17                          +9998
   0.0    1.00000                0       0.00000               +4        0
                       +17                         +10002
   0.1    1.00017            (+100)     +0.10002             (+23)       2
                     (+117)                        (10025)
   0.2   (1.00134)           (+200)     (0.20027)             (83)       7
                      (317)                        (10108)
   0.3   (1.00451)                      (0.30135)


We can ignore $\tfrac{1}{12}\delta^2(xy)$ in proceeding to the next stage. We have

$$100\,\delta^2 y_{0.1} = 0.10002 + 0, \qquad \delta^2 y_{0.1} = 0.00100.$$

We enter this in its place in the differences of y, add to $\Delta y_{0.05}$ to give +0.00117, and add
this in turn to $y_{0.1}$ to give $y_{0.2} = 1.00134$. We then work out xy with this value, 0.20027,
and form its differences. We now see that $\tfrac{1}{12}\delta^2(xy)$ should be 2 in the fifth decimal, but this
will affect nothing in the calculation yet made, which can therefore be taken as confirmed.

At the next stage we provisionally try $\tfrac{1}{12}\delta^2(xy)_{0.2} = 0.00004$; then

$$100\,\delta^2 y_{0.2} = 0.20027 + 0.00004, \qquad \delta^2 y_{0.2} = 0.00200,$$

and we enter this and work up to $y_{0.3}$. We calculate xy as before and form its differences. We
now find $\tfrac{1}{12}\delta^2(xy)_{0.2} = 0.00007$. Again the change does not affect the last figure of y.

We proceed with the calculation; $\delta^2(xy)$ increases until it does begin to affect the
extrapolation of y, and then to a stage when a supplementary table of its differences
becomes worth while to assist its extrapolation. Thus at a later stage we have


    x        y          Δ        Δ²         xy          Δ        Δ²    (1/12)δ²     Δ     Δ²
   1.4    1.50089               2106      2.10125               5298      442             16
                     12805                           34216                         98
   1.5    1.62894               2449      2.44341               6480      540             21
                     15254                           40696                        119
   1.6    1.78148               2857      2.85037               7907      659
                     18111                           48603
   1.7    1.96259              (3344)     3.33640              (9642)    (804)
                    (21455)                         (58245)
   1.8   (2.17714)                       (3.91885)




From the differences of $\tfrac{1}{12}\delta^2(xy)$ we infer that its next second difference is likely to be
about 26, the next first difference therefore 145, so in the next stage we try

$$\tfrac{1}{12}\delta^2(xy) = 659 + 145 = 804.$$

Then

$$100\,\delta^2 y_{1.7} = 3.33640 + 0.00804 = 3.34444;$$

and we enter 3344 as the second difference of y. The next steps, in order, are

$$\Delta y_{1.75} = 0.21455, \qquad y_{1.8} = 2.17714, \qquad (xy)_{1.8} = 3.91885,$$
$$\Delta(xy)_{1.75} = 0.58245, \qquad \delta^2(xy)_{1.7} = 9642, \qquad \tfrac{1}{12}\delta^2(xy)_{1.7} = 804.$$

This agrees with the trial value and no change is needed. The complete table is as
follows:


(Differences are in units of the fifth decimal. Each first difference is entered on the line of
the lower of its two arguments; extrapolated values are in brackets.)

    x        y         Δ       Δ²        xy         Δ       Δ²   (1/12)δ²    Δ     Δ²
  -0.2    0.99866     117             -0.19973     9975
  -0.1    0.99983      17    -100     -0.09998     9998      23       2
   0.0    1.00000      17       0      0.00000    10002       4       0
   0.1    1.00017     117     100      0.10002    10025      23       2
   0.2    1.00134     317     200      0.20027    10108      83       7
   0.3    1.00451     619     302      0.30135    10293     185      15
   0.4    1.01070    1024     405      0.40428    10619     326      27
   0.5    1.02094    1535     511      0.51047    11130     511      43
   0.6    1.03629    2157     622      0.62177    11873     743      62
   0.7    1.05786    2898     741      0.74050    12897    1024      85     29
   0.8    1.08684    3769     871      0.86947    14261    1364     114     33
   0.9    1.12453    4783    1014      1.01208    16028    1767     147     40
   1.0    1.17236    5957    1174      1.17236    18276    2248     187     48
   1.1    1.23193    7314    1357      1.35512    21096    2820     235     57
   1.2    1.30507    8883    1569      1.56608    24599    3503     292     68     11
   1.3    1.39390   10699    1816      1.81207    28918    4319     360     82     14
   1.4    1.50089   12805    2106      2.10125    34216    5298     442     98     16
   1.5    1.62894   15254    2449      2.44341    40696    6480     540    119     21
   1.6    1.78148   18111    2857      2.85037    48603    7907     659    145     26
   1.7    1.96259   21455    3344      3.33640    58245    9642     804    176     31
   1.8    2.17714   25384    3929      3.91885    70001   11756     980    215     39
   1.9    2.43098   30015    4631      4.61886    84340   14339    1195    262     47
   2.0    2.73113           (5477)     5.46226                    (1457)


The solution differs at x = 2.0 by 0.0007 from the previous one. This is about the difference
that might be expected from the accumulation of rounding-off errors. The correct solution
found from power series is 1.17230 for x = 1.0, 2.73088 for x = 2.0.
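
The two values quoted from the power series can be recovered in a few lines of Python. The
sketch is ours (the function name is hypothetical); it uses the relation
$a_{n+3} = a_n/\{(n+2)(n+3)\}$ between the coefficients, which follows on equating powers of x
in y'' = xy with y(0) = 1, y'(0) = 0.

```python
def airy_type_series(x, terms=40):
    """Sum the power series of the solution of y'' = x*y with y(0) = 1, y'(0) = 0."""
    total = 0.0
    term = 1.0   # a_0 * x^0
    n = 0
    for _ in range(terms):
        total += term
        # next non-zero term: a_{n+3} x^{n+3} = a_n x^n * x^3 / ((n + 2)(n + 3))
        term *= x**3 / ((n + 2) * (n + 3))
        n += 3
    return total
```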

With regard to the neglected term in $\delta^4(xy)$, this also may accumulate. But if we sum
(7) we see that the total contribution of $\delta^4(xy)$ will be about $h^2/240$ times the change of
$\delta^3(xy)$, and by inspection of the table this change is about 3000 in the fifth decimal. Hence
the total effect of the $\delta^4$ term in the range is about 1 in the sixth decimal and is correctly
neglected.

9.14. The Gauss-Jackson method. The last method was used by Cowell and Crommelin
in their work on the motion of Halley’s comet between 1759 and 1910, and has been
extensively used in the solution of Schrödinger’s equation, particularly by D. R. Hartree.*
Cowell, however, recommended a slightly different method in an appendix.† This is
discussed further by J. Jackson,‡ who remarks that the matter had been left in a
practically perfect state by Gauss. The procedure is to introduce a function whose second
differences are f. If we have such a function we can denote it by $\delta^{-2}f$; and then (7) can be written

$$h^{-2}\delta^2 y = \delta^2\left(\delta^{-2}f + \tfrac{1}{12}f - \tfrac{1}{240}\delta^2 f + \tfrac{31}{60480}\delta^4 f - \dots\right), \qquad (8)$$

and the two functions

$$h^{-2}y, \qquad \delta^{-2}f + \tfrac{1}{12}f - \tfrac{1}{240}\delta^2 f + \tfrac{31}{60480}\delta^4 f - \dots \qquad (9)$$

have the same second differences. But a differential equation of the second order needs
two adjustable constants to specify its solution; and any function of the form A + Bx
will give zero second difference. Consequently we are at liberty to add A + Bx to $\delta^{-2}f$ and
choose A and B so that the expressions (9) will be equal for two values of x. Then since the
second differences are equal the functions are equal for all tabulated values of x, and

$$h^{-2}y = \delta^{-2}f + \tfrac{1}{12}f - \tfrac{1}{240}\delta^2 f + \tfrac{31}{60480}\delta^4 f - \dots. \qquad (10)$$

The advantage of this procedure is that the summation to get $\delta^{-2}f$, once we have started,
can be done exactly, and each rounding-off error in the correcting terms of (10) arises only
once, and, with h = 0.1, is divided by 100 before it is passed on to the next stage of the
calculation.

To make a start with the solution of the same equation as before we fit the values
already found at x = 0 and 0.1. We have

$$100.000 = \delta^{-2}f_{0.0} + 0, \qquad 100(1.000167) = \delta^{-2}f_{0.1} + \tfrac{1}{12}(0.10002),$$

whence $\delta^{-2}f_{0.1} = 100.0083$.

The calculation is now straightforward.§ We enter the table as follows:


(Each value of δ⁻¹f is entered on the line of the lower of the two arguments between which
it falls.)

    x        δ⁻²f         δ⁻¹f         f         100y
   0.0    100.0000       0.0083     0.00000    100.000
   0.1    100.0083       0.10832    0.10002    100.017
   0.2    100.11662      0.30859    0.20027    100.133
   0.3    100.42521      0.60994    0.30135    100.450
   0.4    101.03515      1.01422    0.40428    101.069
   0.5    102.04937      1.52468    0.51046    102.092
   0.6    103.57405      2.14646    0.62178    103.626
   0.7    105.72051      2.88693    0.74047    105.782
   0.8    108.60744      3.75637    0.86944    108.680
   0.9    112.36381      4.76840    1.01203    112.448
   1.0    117.13221      5.94070    1.17230    117.230
   1.1    123.07291      7.29575    1.35505    123.186
   1.2    130.36866      8.86174    1.56599    130.499
   1.3    139.23040     10.67369    1.81195    139.381
   1.4    149.90409     12.77480    2.10111    150.079
   1.5    162.67889     15.21803    2.44323    162.882
   1.6    177.89692     18.06817    2.85014    178.134
   1.7    195.96509     21.40430    3.33613    196.243
   1.8    217.36939     25.32283    3.91853    217.696
   1.9    242.69222     29.94129    4.61846    243.077
   2.0    272.63351                 5.46178    273.089


* Manchester Lit. and Phil. Mem. and Proc. 77, 1933, 91-107; D. R. Hartree and W. Hartree,
Proc. Roy. Soc. A, 150, 1935, 9-33; 154, 1936, 588-607; 156, 1936, 45-62; 166, 1938, 450-64.

† Greenwich Observations, 1909.

‡ Mon. Not. R. Astr. Soc. 84, 1924, 602-6.

§ With no extra trouble we could find an extra figure in $\delta^{-2}f_{0.1}$, and this would improve the
accuracy. This has not been done, however, so as to have a fairer comparison with the other
methods.

To start with, we difference the first two values of $\delta^{-2}f$ to get $\delta^{-1}f_{0.05}$. f is known at 0.1,
and we add it to $\delta^{-1}f_{0.05}$ to get $\delta^{-1}f_{0.15}$. This is then added to $\delta^{-2}f_{0.1}$ to give $\delta^{-2}f_{0.2}$. Then
$y_{0.2}$ is given by the equation

$$100y = 100.1166 + \tfrac{1}{12}f_{0.2}.$$

By extrapolation we try $f_{0.2} = 0.200$, and the correcting term is +0.0167, making
100y = 100.133. Multiplying this by 0.2 we have $f_{0.2} = 0.20027$, and the change makes
no change in the third decimal of 100y. If there was a change at any stage we should
continue the approximation till there is none. It is convenient to extrapolate $\tfrac{1}{12}f$ at each
stage, and not to fill in f till the second approximation to save rewriting.
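
The summation procedure can be sketched in Python for the same equation y'' = xy. The
sketch is ours and is given under assumptions: the names are hypothetical, only the terms
$\delta^{-2}f + \tfrac{1}{12}f$ of (10) are kept, the start is made from the two-term series at x = 0 and 0.1,
and the trial f at each stage is got by linear extrapolation.

```python
def gauss_jackson(h=0.1, steps=19, sweeps=3):
    """Integrate y'' = x*y, y(0) = 1, y'(0) = 0 by summing f twice."""
    xs = [0.0, h]
    ys = [1.0, 1.0 + h**3 / 6.0]           # starting values from the Taylor series
    fs = [x * y for x, y in zip(xs, ys)]
    # Fit the second sum S = delta^-2 f to the two known values: y / h^2 = S + f / 12.
    S = [y / (h * h) - f / 12.0 for y, f in zip(ys, fs)]
    first_sum = S[1] - S[0]                # delta^-1 f at the first half-interval
    for n in range(steps):
        x_next = (n + 2) * h
        first_sum += fs[-1]                # delta^-1 f at the next half-interval
        S_next = S[-1] + first_sum         # delta^-2 f at x_next, by plain addition
        f_next = 2.0 * fs[-1] - fs[-2]     # trial f by extrapolation
        for _ in range(sweeps):
            y_next = h * h * (S_next + f_next / 12.0)
            f_next = x_next * y_next       # revised f from the new y
        xs.append(x_next)
        ys.append(y_next)
        fs.append(f_next)
        S.append(S_next)
    return xs, ys, S
```

The sums themselves are never corrected; only the small term in f/12 is revised at each
stage, which is the feature of the method remarked on above.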

The fourth decimal in $\delta^{-1}f$ and the third in $\delta^{-2}f$ have little direct importance, but it is
as easy to write them in as not, and they enable the rounding-off errors to be absorbed
harmlessly in a place where they will be divided by 100 before y is calculated. The result
for x = 2.0 is y = 2.73089, which is 1 unit of the fifth place from the correct value; and
the amount of subsidiary calculation is less than with either of the other methods. It is
not even necessary to write in the differences of f and y, since neither affects the
calculation. The approximation for y, however, is of a kind such that a mistake is likely to be
repeated at the next approximation, and differences should be used as a check. Occasional
inspection of the second differences of f is also desirable in case their contributions should
become appreciable; but they must reach $120h^{-2}$ in the last place retained for them to
matter, and if they do it is less trouble to use a shorter interval. Special attention should
be given to the calculation of the first two values of $\delta^{-2}f$, because an error in their
difference produces an error that may increase steadily throughout the calculation. As soon as
four or five values of y have been found they should be differenced to check this stage of
the calculation.

The possibility of using this method depends on the absence of a term in dy/dx from the
differential equation. The convenience of the method is such that when such a term is
present in a linear equation it is best to begin by transforming the equation so as to remove
it. Astronomers, in computing perturbations, therefore largely prefer to use rectangular
coordinates, even though this involves sacrificing the use of the elliptic orbit as a first
approximation. The component accelerations due to the sun are included in the numerical
computation at each stage and treated just like the planetary terms. This inconvenience
is far more than compensated by having to deal with differential equations of the form

$$\frac{d^2x_r}{dt^2} = f_r(x_1, x_2, \dots, x_n)$$

instead of, for example, in polar coordinates,

$$\frac{d}{dt}(\sin^2\theta\,\dot\lambda) = g(x_1, x_2, \dots, x_n).$$

If f(x, y) varies considerably within the range of integration it may be convenient to
increase or reduce the interval. To change from interval 0.1 to 0.2 at x = 2.0, we should
use the values already found for $y_{1.8}$ and $y_{2.0}$ to find corresponding values of $\delta^{-2}f$, and
start afresh. To change from 0.1 to 0.05 would require first the interpolation of a value
for $y_{1.95}$, and then the calculation of $\delta^{-2}f_{1.95}$, $\delta^{-2}f_{2.0}$. The latter will not be the same as for
the original interval.

9.15. Estimation of an eigenvalue. This method of solution is conveniently
combined with Rayleigh’s principle to give a rapidly converging series of approximations
to, for instance, the period of a dynamical system. Consider for instance the oscillations
of water in a narrow lake of elliptical plan. If ζ is the elevation of the water surface, u the
velocity, h the depth, g gravity, and b the breadth, the equations for a small oscillation of
period 2π/γ are

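$$\frac{\partial u}{\partial t} = -g\,\frac{\partial\zeta}{\partial x}, \qquad (1)$$

$$b\,\frac{\partial\zeta}{\partial t} = -\frac{\partial}{\partial x}(hbu). \qquad (2)$$

Put

$$hbu = V;$$

then on elimination in favour of V we have

$$\frac{d}{dx}\left(\frac{1}{b}\frac{dV}{dx}\right) + \frac{\kappa^2 V}{b} = 0, \qquad (3)$$

where

$$\kappa^2 = \gamma^2/gh. \qquad (4)$$

The boundary conditions are that V = 0 at the ends. We can remove the term in dV/dx
by the substitution

$$V = b^{1/2}U; \qquad (5)$$

then

$$\frac{d^2U}{dx^2} + \left(\kappa^2 + \frac{b''}{2b} - \frac{3b'^2}{4b^2}\right)U = 0. \qquad (6)$$

With

$$b \propto (1 - x^2)^{1/2} \qquad (7)$$

this is

$$\frac{d^2U}{dx^2} = -\left\{\kappa^2 - \frac{1 + \tfrac32 x^2}{2(1 - x^2)^2}\right\}U. \qquad (8)$$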

It can be shown* that the two solutions near x = ±1 make U behave like $(1 - x^2)^{-1/4}$ or
$(1 - x^2)^{5/4}$. The former would make V different from zero at x = ±1, and must therefore
be excluded. The problem is to find the values of κ² that make it possible to avoid this
solution at both ends. By symmetry U must be either an even or an odd function of x.
The mean kinetic energy over a period is given by

$$4T = \int bhu^2\,dx = \int \frac{V^2}{hb}\,dx, \qquad (9)$$

and the mean potential energy by

$$4W = \int g\zeta^2 b\,dx = \int \frac{g}{\gamma^2}\left(\frac{dV}{dx}\right)^2 \frac{dx}{b}. \qquad (10)$$

Using the principle that the mean kinetic and potential energies in a period are equal,
we have

$$\kappa^2 = \int \frac{1}{b}\left(\frac{dV}{dx}\right)^2 dx \Big/ \int \frac{V^2}{b}\,dx. \qquad (11)$$

Rayleigh’s principle is that any form of V satisfying the boundary conditions, but not
the differential equation, will give a second-order error in κ² when substituted in (11).

It is fairly clear that the lowest value of κ will be such that dV/dx is, on the whole, as
small as possible for given mean V²; and therefore that V keeps the same sign for all x.
The next lowest value will make V change sign once, and so on.


* Cf. Chapter 16,

For each trial value of κ² we first make a table of the coefficient of U in (8). We take
U = 1.00000, dU/dx = 0 at x = 0, and have for small x

$$U = 1 - \tfrac12(\kappa^2 - \tfrac12)x^2 + \tfrac{1}{24}\{(\kappa^2 - \tfrac12)^2 + \tfrac72\}x^4, \qquad (12)$$

which gives U for x = 0.1 and hence $\delta^{-2}f$, where f is the right side of (8).

The solution is then developed. An incorrect value of κ² will be shown by the solution
tending to ±∞ at x = 1. Specimen solutions are as follows:


  x     U (κ² = 3)   U (κ² = 4)       x     U (κ² = 3)   U (κ² = 4)
 0.0      1.0000       1.0000        0.5      0.7129       0.6024
 0.1      0.9875       0.9826        0.6      0.6028       0.4515
 0.2      0.9506       0.9310        0.7      0.4862       0.2912
 0.3      0.8908       0.8478        0.8      0.3718       0.1288
 0.4      0.8104       0.7365        0.9      0.2781      −0.0292
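
The marching solution tabulated above can be imitated by machine. The following is only a minimal sketch and not the hand method of the text: it integrates (8) from $x = 0$ with the classical fourth-order Runge-Kutta rule (a swapped-in technique), and the names `coefficient` and `march` and the step count are illustrative assumptions.

```python
def coefficient(x, kappa2):
    """q(x) in U'' = -q(x) U, taken from equation (8)."""
    return kappa2 - (1.0 + 1.5 * x * x) / (2.0 * (1.0 - x * x) ** 2)


def march(kappa2, x_end=0.9, steps=900):
    """Integrate (8) from x = 0 with U = 1, dU/dx = 0, by classical RK4."""
    h = x_end / steps
    x, u, v = 0.0, 1.0, 0.0  # v stands for dU/dx

    def deriv(x, u, v):
        return v, -coefficient(x, kappa2) * u

    for _ in range(steps):
        k1u, k1v = deriv(x, u, v)
        k2u, k2v = deriv(x + h / 2, u + h / 2 * k1u, v + h / 2 * k1v)
        k3u, k3v = deriv(x + h / 2, u + h / 2 * k2u, v + h / 2 * k2v)
        k4u, k4v = deriv(x + h, u + h * k3u, v + h * k3v)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        x += h
    return u


u3 = march(3.0)  # trial value below the true kappa^2
u4 = march(4.0)  # trial value above it
print(u3, u4)
```

The two trial values should leave $U$ at $x = 0.9$ with opposite signs, which is the bracketing that the interpolation for $\kappa^2$ relies on.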


With the information that the solutions near $x = 1$ should be of the form 

$$A(1-x)^{5/4} + B(1-x)^{-1/4}$$

we can find $A$ and $B$ roughly from the last two entries in each table. We find 

$\kappa^2 = 3$, $A = +1.55$, $B = +0.107$, 

$\kappa^2 = 4$, $A = +1.72$, $B = -0.071$. 

Hence the former solution will make $U$ tend to $+\infty$, the second to $-\infty$, as $x \to 1$. By 
interpolation $B$ should vanish with $\kappa^2$ about 3.6. A pair of trials for 3.4 and 3.6 suggested 
3.56; but at this point it seemed that intervals of 0.05 instead of 0.1 would be safer in 
testing the behaviour near $x = 1$. The solution with this interval is as follows: 


κ² = 3.56

  x        U          x        U          x        U
 0       1.0000      0.35    0.8210      0.70    0.3764
 0.05    0.9962      0.40    0.7693      0.75    0.3036
 0.10    0.9850      0.45    0.7127      0.80    0.2327
 0.15    0.9662      0.50    0.6510      0.85    0.1643
 0.20    0.9401      0.55    0.5867      0.90    0.1000
 0.25    0.9069      0.60    0.5175      0.95    0.0428
 0.30    0.8671      0.65    0.4471




If the solution was correct the ratio of the last two entries should be nearly $2^{5/4} = 2.378$. 
It is actually 2.338, so that we are very close. To get a better approximation we use 
Rayleigh’s principle. We multiply $U$ by $(1-x^2)^{1/4}$ to get $V$, differentiate numerically, work 
out $(1-x^2)^{-1/2}(dV/dx)^2$, and integrate. As the integrand behaves like $(1-x^2)^{1/2}$ near $x = 1$ 
it is best to use the formulae of 9.092 from 0.90 to 1.00, and the Gregory formula up to 
0.90. This gives 

$$\int_0^1 (1-x^2)^{-1/2}\left(\frac{dV}{dx}\right)^2 dx = 1.6287.$$

The integration of $U^2$ is simple except again for values beyond 0.9. For these we 
assume that 

$$U^2 = 0.0100\left(\frac{1-x}{0.1}\right)^{5/2},$$

whence 

$$\int_{0.9}^{1} U^2\,dx = 0.0004$$

and 

$$\int_0^1 U^2\,dx = 0.4596.$$

Then 

$$\kappa^2 = \frac{1.6287}{0.4596} = 3.544; \qquad \kappa = 1.882.$$

Previous solutions had given $\kappa = 1.886$ by totally different methods.* 

The solutions for $\kappa^2 = 3$ and $\kappa^2 = 4$ were not really necessary to the method, but are 
given to show how we can detect with a wrong value that the solution is not tending to 
zero in the way it should. Rayleigh’s solution will usually give an accuracy within a few 
per cent with an assumed form that is even roughly near the truth, and it would have been 
possible to apply it at the start with $U = (1-x^2)^{5/4}$. This gives $\kappa^2 = 3.6$ immediately. As 
the error is squared at each stage it should be possible to get four-figure accuracy with at 
most three solutions. 
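
The estimate from the crude trial form can be checked by machine. This is a sketch only: it takes the ratio of the integral of $(dV/dx)^2/b$ to that of $V^2/b$ with $b = (1-x^2)^{1/2}$, and uses a plain midpoint rule in place of the Gregory formula of the text; the function name and the number of intervals are assumptions made for illustration.

```python
import math


def rayleigh_kappa2(V, dV, n=20000):
    """Ratio of the integrals of (dV/dx)^2 / b and V^2 / b over 0 <= x <= 1."""
    h = 1.0 / n
    num = den = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoints never touch x = 1, where b = 0
        b = math.sqrt(1.0 - x * x)
        num += dV(x) ** 2 / b
        den += V(x) ** 2 / b
    return num / den


# Trial form U = (1 - x^2)^(5/4), so that V = b^(1/2) U = (1 - x^2)^(3/2).
def V(x):
    return (1.0 - x * x) ** 1.5


def dV(x):
    return -3.0 * x * math.sqrt(1.0 - x * x)


kappa2 = rayleigh_kappa2(V, dV)
print(round(kappa2, 3))  # close to 3.6
```

For this trial form both integrals are elementary, $9\pi/16$ and $5\pi/32$, so the ratio is exactly 18/5.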

Alternatively, we could assume a solution $(1-x^2)^{5/4}(1 + Ax^2 + Bx^4)$ and determine 
$A$, $B$ to make $\kappa^2$, as found from (11), stationary for small variations of $A$, $B$. A similar 
method in principle was used by Ritz to determine the normal modes of vibration of a 
square plate. 

To treat the second mode it is desirable to begin by subtracting from the trial solution 
such a multiple of the solution for the lowest mode as will make the remainder exactly 
orthogonal with the latter solution (cf. 6.08).† 

9.16. Numerical solution of simultaneous linear equations. The methods 
usually given are unnecessarily laborious. We shall illustrate the solution by an example. 
Take the three equations 

 6.3x − 3.2y + 1.0z = +7.8,     (1)
−3.2x + 8.4y − 2.6z = −2.3,     (2)
+1.0x − 2.6y + 5.7z = +8.6.     (3)

The coefficients here form a symmetrical matrix: this is not necessary to the method, but 
in practice the condition is so often satisfied that we may as well take an instance of it. 
Divide the first equation by the coefficient of x, and then multiply the resulting equation 
by the coefficients of x in the other two. By addition or subtraction we then eliminate x 
and proceed. The complete solution is arranged as follows: 

 6.3x − 3.2y + 1.0z = +7.8          x − 0.508y + 0.159z = +1.238
−3.2x + 8.4y − 2.6z = −2.3       3.2x − 1.63y  + 0.51z  = +3.96
+1.0x − 2.6y + 5.7z = +8.6          x − 0.51y  + 0.16z  = +1.24
        6.77y − 2.09z = +1.66                  y − 0.309z = +0.245
       −2.09y + 5.54z = +7.36              2.09y − 0.65z  = +0.51
                4.89z = +7.87                           z = +1.609

y = +0.245 + 0.497 = +0.742
x = 1.238 + 0.377 − 0.256 = +1.359

* Jeffreys, Proc. Lond. Math. Soc. (2) 23, 1924, 463. Goldstein, Proc. Lond. Math. Soc. (2) 28, 
1928, 95. 

† For further discussion of the Ritz method see Temple and Bickley, Rayleigh's Principle, 
pp. 150-2. 

Check by substitution: 

 8.56 − 2.37 + 1.61 = +7.80, 
−4.35 + 6.23 − 4.18 = −2.30, 
+1.36 − 1.93 + 9.17 = +8.60. 

The check shows that the solution is right to about 0.001. 
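
The same elimination can be written as a short program. This is a hedged sketch, not a general-purpose solver: it follows the order of operations of the worked example (divide by the leading coefficient, eliminate, back-substitute) without any pivoting, which is adequate here only because the leading coefficients are comfortably large. The function name is an assumption.

```python
def solve_by_elimination(a, rhs):
    """Solve a x = rhs by elimination without pivoting, then back-substitution."""
    n = len(rhs)
    # Work on an augmented copy so that the inputs are left untouched.
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for k in range(n):
        pivot = m[k][k]
        m[k] = [v / pivot for v in m[k]]  # divide by the leading coefficient
        for i in range(k + 1, n):
            factor = m[i][k]
            m[i] = [vi - factor * vk for vi, vk in zip(m[i], m[k])]
    x = [0.0] * n
    for k in reversed(range(n)):  # back-substitution from the last unknown
        x[k] = m[k][n] - sum(m[k][j] * x[j] for j in range(k + 1, n))
    return x


A = [[6.3, -3.2, 1.0], [-3.2, 8.4, -2.6], [1.0, -2.6, 5.7]]
b = [7.8, -2.3, 8.6]
x, y, z = solve_by_elimination(A, b)
print(x, y, z)
```

Carried out in full floating-point precision the result agrees with the three-figure hand solution to about the 0.001 stated in the text.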

When there are more than three or four equations a mistake will usually be made, 
and it is desirable to be able to detect it before reaching the final check by substitution. 
This can be done in two ways. (1) If the unknowns are $x_1, x_2, \ldots, x_n$ we can first eliminate 
$x_1$ and then $x_2$ by the above method. Then transpose the first two equations and eliminate 
first $x_2$ and then $x_1$. The whole of the coefficients in the simplified equations for $x_3, x_4, \ldots$ 
should be the same for both methods, and the place where any inconsistency occurs 
indicates at once a small set of steps where a mistake can have occurred. (2) If all the 
coefficients are calculated to the same number of decimals, we can take their sum and 
perform the same operations on the sums. It does not matter whether we reverse the 
sign of the term on the right so long as we always do the same. Thus 

6.3 − 3.2 + 1.0 + 7.8 = +11.9,     11.9/6.3 = +1.889, 
1.000 − 0.508 + 0.159 + 1.238 = +1.889, 

which checks the first division. Next, 

−3.2 + 8.4 − 2.6 − 2.3 = +0.3,     1.889 × 3.2 = 6.04, 
0.3 + 6.04 = 6.34,     6.77 − 2.09 + 1.66 = 6.34, 

which checks the elimination of x. In this method the check sum is written in an extra 
column to the right of its equation, and any mistake can be detected by adding coefficients. 
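
The check-sum column lends itself to a small illustration. The sketch below is an assumption about how one might mechanize the idea, not a procedure taken from the text: each row carries the sum of its entries as an extra element, every row operation is applied to that element too, and a slip shows up as a row whose last entry no longer equals the sum of the rest.

```python
def with_check(row):
    """Append the sum of the coefficients and right-hand side as a check column."""
    return row + [sum(row)]


def check_ok(row, tol=1e-9):
    """True when the last entry still equals the sum of the other entries."""
    return abs(sum(row[:-1]) - row[-1]) < tol


r1 = with_check([6.3, -3.2, 1.0, 7.8])
r2 = with_check([-3.2, 8.4, -2.6, -2.3])

n1 = [v / r1[0] for v in r1]  # first division, check column included
e2 = [v2 - r2[0] * v1 for v1, v2 in zip(n1, r2)]  # eliminate x from the second equation

bad = e2[:]
bad[1] += 0.1  # simulate a slip in one coefficient

print(check_ok(n1), check_ok(e2), check_ok(bad))
```

The check entries `n1[-1]` and `e2[-1]` correspond to the values 1.889 and 6.34 of the worked example.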

An alternative method due to von Seidel can be used when the matrix of the coefficients 
on the left is that of a positive definite quadratic form. To illustrate it on the above set 
of equations we write them in the form 

x = +1.238 + 0.508y − 0.159z,     (4)
y = −0.274 + 0.381x + 0.310z,     (5)
z = +1.509 − 0.175x + 0.456y,     (6)

and solve by successive approximation. The method requires for its rapid convergence 
that the coefficients of x, y, z on the right shall be fairly small; in these equations they are 
a little too large to show the method at its best. We first neglect y and z in (4) and take 
the first approximation x = +1.238. Substitute this on the right of (5). This gives 
y = +0.198. Substitute both these values in the right of (6); we get z = +1.382. 

Now return to (4) and substitute y = +0.198, z = +1.382; we have now x = +1.119. 
Proceeding, we get approximations in turn as follows: 


 x    +1.238    1.119    1.282    1.340    1.351
 y    +0.198    0.580    0.703    0.735    0.740
 z    +1.382    1.577    1.606    1.610    1.608


The change from the third approximation to the fourth is small. Apart from the formation 
of (4), (5), (6) the method is iterative and checks itself. It is seen at its best when the 
number of equations is large and the non-diagonal coefficients are all small and many 
of them zero. 
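
The successive approximations can be reproduced with a few lines of code. This is a minimal sketch under the assumption that each unknown is recomputed from its own equation using the latest available values of the others, starting from zero; the function name and the number of sweeps are illustrative.

```python
def seidel_sweep(a, rhs, x):
    """One sweep: solve equation i for unknown i, using the newest values of the rest."""
    n = len(rhs)
    for i in range(n):
        s = sum(a[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (rhs[i] - s) / a[i][i]
    return x


A = [[6.3, -3.2, 1.0], [-3.2, 8.4, -2.6], [1.0, -2.6, 5.7]]
b = [7.8, -2.3, 8.6]

x = [0.0, 0.0, 0.0]
history = []
for _ in range(25):
    seidel_sweep(A, b, x)
    history.append(tuple(x))

print(history[0])   # first approximation
print(history[-1])  # after 25 sweeps
```

Starting from zero makes the first sweep coincide with neglecting y and z in (4), so the first entry of `history` matches the first column of the table.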


The point of the method is that the solution is the set of values that make the function 

$$S = 3.15x^2 - 3.2xy + 1.0xz + 4.2y^2 - 2.6yz + 2.85z^2 - 7.8x + 2.3y - 8.6z$$

a minimum. The minimum exists because the quadratic terms are positive definite. 
Let it be $\Sigma$, and let the corresponding values of $x, y, z$ be $x_0, y_0, z_0$. When we adjust any 
unknown we find the value that makes $S$ a minimum given that the other unknowns 
have the values taken in the previous approximation. Hence the values of $S$, say $S_n$, 
corresponding to successive approximations form a non-increasing sequence. Also, unless 
all the equations are satisfied, adjustment of at least one of $x, y, z$ will reduce $S_n$; hence if 
$S_n > \Sigma$, $S_{n+3} < S_n$. Put $S_n - S_{n+3} = T_n$; then, if we consider all values of $x, y, z$ that give the 
same value of $S_n$, $T_n$ is continuous and takes its lower bound, which therefore cannot be 
zero. Also, if we put $x = x_0 + x'$, $y = y_0 + y'$, $z = z_0 + z'$, $S = \Sigma + S'$, $T_n$ is a positive definite 
quadratic in $x'_n, y'_n, z'_n$. Let the lower bound of $T_n$ when $x', y', z'$ vary, $S_n$ being constant, 
be $\alpha S'_n$. Then $1 \geq \alpha > 0$. If $x'_n, y'_n, z'_n$ are all multiplied by the same constant factor $k$, 
$S'_n$ and $T_n$ are both multiplied by $k^2$. Thus for any set of values $x'_n, y'_n, z'_n$, the approximations 
after three more steps will give 

$$S'_{n+3} = S'_n - T_n \leq (1-\alpha)S'_n.$$

Since $\{S'_n\}$ is non-increasing, it follows that $S'_n \to 0$. Also the successive inequalities $S' < S'_n$ 
specify a set of regions of $x, y, z$ each contained in the preceding, with diameters tending 
to zero. Hence the values of $x, y, z$ given by the process converge to values corresponding 
to $S' = 0$, that is, to the correct solution. 
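
The monotonic decrease on which this argument rests is easy to observe numerically. The sketch below is illustrative only: it evaluates the quadratic function belonging to the three equations of the example after every single adjustment and records the values; the starting point (all unknowns zero) and the number of cycles are assumptions.

```python
def S(x, y, z):
    """Quadratic function whose minimum is the solution of the three equations."""
    return (3.15 * x * x - 3.2 * x * y + 1.0 * x * z + 4.2 * y * y
            - 2.6 * y * z + 2.85 * z * z - 7.8 * x + 2.3 * y - 8.6 * z)


x = y = z = 0.0
values = [S(x, y, z)]
for _ in range(10):
    x = (7.8 + 3.2 * y - 1.0 * z) / 6.3   # minimizes S in x for fixed y, z
    values.append(S(x, y, z))
    y = (-2.3 + 3.2 * x + 2.6 * z) / 8.4  # minimizes S in y for fixed x, z
    values.append(S(x, y, z))
    z = (8.6 - 1.0 * x + 2.6 * y) / 5.7   # minimizes S in z for fixed x, y
    values.append(S(x, y, z))

print(values[0], values[-1])
```

Each adjustment is the exact minimizer of $S$ in one variable, so no entry of `values` can exceed the one before it.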


9.17. Jury problems: ordinary differential equations. For an ordinary differential 
equation of the second order we may either have given values of $y$ and $dy/dx$ at 
one terminus, or a value of one of them at each of two termini. In the former case we form 
the numerical solution by proceeding one step at a time; this has therefore been called a 
marching problem by L. F. Richardson. The latter type of problems are called jury problems. 
For jury problems, when the equation is linear, we can make the solution depend 
on those of two marching problems from one terminus, which are then combined linearly 
so as to satisfy the condition at the other. This method fails for non-linear equations. It 
is possible to obtain a first approximation that satisfies the terminal conditions, and 
then the differential equation can be converted into a finite difference equation and used 
to obtain better approximation to the intermediate values. As a simple case, take the 
equation 

$$\frac{d^2y}{dx^2} = -y \tag{1}$$

with $y = 0$ at $x = 0$, and $y = 1$ at $x = 1$. We know that the solution is 

$$y = \frac{\sin x}{\sin (1\ \text{radian})}. \tag{2}$$

But suppose that we did not know this. Try to interpolate a value of $y$ at $x = 0.5$. We have, 
using second differences at interval 0.5, 

$$4(y_0 - 2y_{0.5} + y_1) = -y_{0.5}, \tag{3}$$

whence 

$$7y_{0.5} = 4y_1 + 4y_0 = 4, \tag{4}$$

$$y_{0.5} = 0.57. \tag{5}$$

Now interpolate to intervals 0.2 by second differences. This gives the first approximation $y_1$: 


  x       y₁       y₂       y₃       y₄     Correct y
 0        0        0        0        0        0
 0.2    0.245    0.239    0.238    0.237    0.236
 0.4    0.467    0.468    0.466    0.464    0.463
 0.6    0.667    0.672    0.674    0.673    0.671
 0.8    0.845    0.851    0.853    0.854    0.853
 1.0    1.000    1.000    1.000    1.000    1.000


With intervals 0.2, (3) is replaced by 

$$25(y_{-h} - 2y_0 + y_h) = -y_0, \tag{6}$$

that is 

$$y_0 = 0.5102(y_{-h} + y_h). \tag{7}$$

With the values of $y_1$ at $x = 0.6$ and 1.0 this gives a second approximation for $y$ at 0.8, 
namely, 0.851. This, with $y_1$ at 0.4, gives $y_2$ at 0.6 equal to 0.672. We thus get the column 
$y_2$; further similar approximations give $y_3$ and $y_4$. The correct values are given in the last 
column. Fourth differences can be taken into account if required, but it is then necessary 
to continue the solution one place beyond the ends of the table. 
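
The successive columns of the table can be regenerated mechanically. The following sketch assumes, as the description suggests, that each pass runs from $x = 0.8$ down to $x = 0.2$ and uses every new value as soon as it is available; the starting column is copied from the table, and the function name and the number of passes are illustrative.

```python
import math

h = 0.2
c = 1.0 / (2.0 - h * h)  # from (y[i-1] - 2 y[i] + y[i+1]) / h^2 = -y[i]; equals 25/49

first = [0.0, 0.245, 0.467, 0.667, 0.845, 1.0]  # first approximation from the table


def sweep(y):
    """One pass from x = 0.8 down to x = 0.2, each new value being used at once."""
    for i in range(len(y) - 2, 0, -1):
        y[i] = c * (y[i - 1] + y[i + 1])
    return y


second = sweep(first[:])

converged = first[:]
for _ in range(200):
    sweep(converged)

exact = [math.sin(i * h) / math.sin(1.0) for i in range(6)]
print([round(v, 3) for v in second])
print([round(v, 3) for v in converged])
```

Even when the passes have converged, the result is the solution of the difference equation rather than of the differential equation, so a small discrepancy from the exact values remains at this interval.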

9.18. Relaxation Methods. Partial differential equations with given boundary 
conditions can be treated by an extension of the method. As an example we take Laplace’s 
equation. Suppose that a solution is expressed in the form 

$$\phi = a_0 + a_1 r\cos\theta + b_1 r\sin\theta + a_2 r^2\cos 2\theta + b_2 r^2\sin 2\theta + \ldots + b_4 r^4\sin 4\theta. \tag{1}$$

Suppose further that we are given the values of $\phi$ at the points (1, 0), (0, 1), (−1, 0), (0, −1). 
Denote these by $\phi_1, \phi_2, \phi_3, \phi_4$. Then we take the coefficients up to $b_2$ as unknowns and try 
to adjust them so as to make the sum agree with these values as closely as possible, judged 
by the sum of squares: that is, we make 

$$(a_0 + a_1 + a_2 - \phi_1)^2 + (a_0 + b_1 - a_2 - \phi_2)^2 + (a_0 - a_1 + a_2 - \phi_3)^2 + (a_0 - b_1 - a_2 - \phi_4)^2 \tag{2}$$

a minimum. The condition for a minimum, with regard to $a_0$, is 

$$\phi_0 = a_0 = \tfrac14(\phi_1 + \phi_2 + \phi_3 + \phi_4). \tag{3}$$

Now consider the set shown in the next diagram and retain 
terms to $b_4$. (The point marked 5 is (2, 0) and so on.) Forming 
a sum of squares similarly we find that the conditions for a minimum, 
so far as they contain $a_0$, are 

$$4a_0 + 34a_4 = \tfrac12(\phi_1 + \ldots + \phi_8),$$

$$34a_0 + 514a_4 = \tfrac12(\phi_1 + \phi_2 + \phi_3 + \phi_4) + 8(\phi_5 + \phi_6 + \phi_7 + \phi_8),$$

whence 

$$a_0 = \tfrac{4}{15}(\phi_1 + \phi_2 + \phi_3 + \phi_4) - \tfrac{1}{60}(\phi_5 + \phi_6 + \phi_7 + \phi_8). \tag{4}$$

For the set in the next figure, we can again retain terms to $b_4$; but it is obvious that 
the estimate of $a_0$ depends only on the sums 

$$S_1 = \phi_1 + \phi_2 + \phi_3 + \phi_4, \qquad S_2 = \phi_5 + \phi_6 + \phi_7 + \phi_8, \tag{5}$$

and we shall get the same value for $a_0$ by taking mean values 

$$\phi_1 = \phi_2 = \phi_3 = \phi_4 = \tfrac14 S_1, \qquad \phi_5 = \phi_6 = \phi_7 = \phi_8 = \tfrac14 S_2.$$

This makes $a_1 = b_1 = a_3 = b_3 = 0$, 

$$\tfrac14 S_1 = a_0 + a_4, \qquad \tfrac14 S_2 = a_0 - 4a_4. \tag{6}$$

Hence, irrespective of $a_4$, 

$$a_0 = \tfrac15 S_1 + \tfrac{1}{20} S_2. \tag{7}$$

If $\phi$ satisfies Laplace’s equation in a region, and we want its value at a point of the region 
given those at surrounding points of a rectangular network, (3) will give an approximation, 
which is simply the mean of the values at the four adjacent points. It takes account only 
of terms in $r^2$; the formulae (4) and (7), which are accurate to $r^4$, will be substantially more 
accurate. 
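
The difference in accuracy between (3), (4) and (7) can be seen on a harmonic polynomial that contains a term in $r^4\cos 4\theta$. The sketch below is an illustration under stated assumptions: the test function is chosen arbitrarily, the net has unit spacing, and the extra points for (7) are taken to be the four diagonal neighbours, which is what the relations (6) imply.

```python
def phi(x, y):
    """Harmonic test function 1 + (x^2 - y^2) + (x^4 - 6 x^2 y^2 + y^4); phi(0, 0) = 1."""
    return 1.0 + (x * x - y * y) + (x ** 4 - 6 * x * x * y * y + y ** 4)


near = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # points 1 to 4
far = [(2, 0), (0, 2), (-2, 0), (0, -2)]    # points 5 to 8 as used for (4)
diag = [(1, 1), (-1, 1), (-1, -1), (1, -1)] # points 5 to 8 as assumed for (7)

s_near = sum(phi(*p) for p in near)
s_far = sum(phi(*p) for p in far)
s_diag = sum(phi(*p) for p in diag)

by_3 = s_near / 4
by_4 = 4 * s_near / 15 - s_far / 60
by_7 = s_near / 5 + s_diag / 20

print(by_3, by_4, by_7)
```

Formula (3) is thrown off by the fourth-degree term, while (4) and (7) recover the central value of this test function.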

The procedure will then be to take a trial set of values over a rectangular network so as 
to satisfy the boundary conditions, and to adjust them in turn. 

Special attention is desirable at corners, where the appropriate expansion of $\phi$ will not 
be of the form (1). In the problem we shall consider in a moment we 
have the distribution shown. Take $\phi$ to be 0 over the boundary and 
to have given trial values at the points marked 2, 3, 4, 5, 6, where the 
values for 2 and 6, and for 3 and 5, are equal. Then the appropriate form 
of $\phi$ near the corner is 

$$\phi = Ar^{2/3}\sin\tfrac23\theta + Br^2\sin 2\theta \tag{8}$$

and the correct values will be 

 Point 4       $A\cdot 2^{1/3}\sin\tfrac12\pi - 2B = 1.260A - 2B$, 
 Points 3, 5   $A\sin\tfrac13\pi = 0.866A$, 
 Points 2, 6   $A\cdot 2^{1/3}\sin\tfrac16\pi + 2B = 0.630A + 2B$.     (9)

It will in general be impossible to find $A$ and $B$ so as to fit three datum values for $\phi$, but 
we can adjust them by least squares to give the best fit as a whole and then use the solution 
as a smoothing function. If the trial values are $\phi_4, \phi_3, \phi_2$ we get the minimum sum of squares 
of residuals at the five points by taking 

$$A = 0.3247\phi_2 + 0.4463\phi_3 + 0.3247\phi_4, \qquad B = \tfrac13\phi_2 - \tfrac16\phi_4. \tag{10}$$

If the adjusted values are $\phi'_4, \phi'_3, \phi'_2$, 

$$\begin{aligned} \phi'_4 &= 0.742\phi_4 + 0.562\phi_3 - 0.258\phi_2, \\ \phi'_3 &= 0.281\phi_4 + 0.386\phi_3 + 0.281\phi_2, \\ \phi'_2 &= -0.129\phi_4 + 0.281\phi_3 + 0.871\phi_2. \end{aligned} \tag{11}$$

These can be checked by taking $\phi_2, \phi_3, \phi_4$ satisfying (9) exactly and verifying that $\phi'_2, \phi'_3, \phi'_4$ 
are equal to them respectively. The adjustment does not assume that the term in 
$r^{10/3}\sin\tfrac{10}{3}\theta$ is zero, but that it is small, and distributes the errors arising from its presence 
among $\phi_2, \phi_3, \phi_4$; but close to the corner the smallness of this term will permit a good 
adjustment. 

We can use (8) to halve the interval. If we take points 7, 8, 9 bisecting the lines joining 
the origin to 2, 3, 4, we have 

$$\begin{aligned} \phi_7 &= 0.3968A + \tfrac12 B = 0.2955\phi_2 + 0.1771\phi_3 + 0.0455\phi_4, \\ \phi_8 &= 0.5456A = 0.1771\phi_2 + 0.2435\phi_3 + 0.1771\phi_4, \\ \phi_9 &= 0.7937A - \tfrac12 B = 0.0910\phi_2 + 0.3542\phi_3 + 0.3410\phi_4. \end{aligned} \tag{12}$$

Even if higher terms are not negligible at 2, 3, 4, they will be much reduced at 7, 8, 9. 
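
The numerical coefficients in the corner formulae follow from an ordinary least-squares fit, which can be checked as below. This is a sketch under assumptions read off from the symmetry described in the text: points 2 and 6 share one trial value and points 3 and 5 another, so they enter the sum of squares with weight 2, while point 4 enters once. The dictionary keys and variable names are illustrative.

```python
import math

c2 = 2 ** (1 / 3) * math.sin(math.pi / 6)  # coefficient of A at points 2 and 6
c3 = math.sin(math.pi / 3)                 # coefficient of A at points 3 and 5
c4 = 2 ** (1 / 3) * math.sin(math.pi / 2)  # coefficient of A at point 4

# (coefficient of A, coefficient of B, number of points sharing the trial value)
rows = {"phi2": (c2, 2.0, 2), "phi3": (c3, 0.0, 2), "phi4": (c4, -2.0, 1)}

saa = sum(w * a * a for a, b, w in rows.values())
sab = sum(w * a * b for a, b, w in rows.values())
sbb = sum(w * b * b for a, b, w in rows.values())
det = saa * sbb - sab * sab

A_coef, B_coef = {}, {}
for name, (a, b, w) in rows.items():
    ra, rb = w * a, w * b  # right-hand sides of the normal equations for a unit trial value
    A_coef[name] = (ra * sbb - rb * sab) / det
    B_coef[name] = (rb * saa - ra * sab) / det

# Coefficient of phi4 in the adjusted value at point 4, i.e. in 1.260 A - 2 B.
adjusted_44 = c4 * A_coef["phi4"] - 2.0 * B_coef["phi4"]
print(A_coef, B_coef, adjusted_44)
```

The cross term of the normal equations vanishes because $2^{1/3}\sin\tfrac12\pi$ is exactly twice $2^{1/3}\sin\tfrac16\pi$, which is why $A$ and $B$ separate so cleanly.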

The process gives a rapid smoothing of departures from the true solution so far as 
they produce differences between the values of $\phi$ at adjacent points of the net. For 
departures that have the same sign at a block of neighbouring points the adjustment is 
much slower, and the process may appear to have converged sufficiently when in fact 
considerable errors survive. (This was true in the example given in the first edition of this 
book.) This is treated by Southwell by a method known as block adjustment. If we consider 
the sum 

$$(\phi_0 - \phi_1 + \delta)^2 + (\phi_0 - \phi_2 + \delta)^2 + (\phi_0 - \phi_3 + \delta)^2 + (\phi_0 - \phi_4 + \delta)^2, \tag{13}$$

it is made a minimum for variations of $\delta$ by taking 

$$\delta = \tfrac14(\phi_1 + \phi_2 + \phi_3 + \phi_4 - 4\phi_0). \tag{14}$$

Hence $\phi_0 + \delta$ is the value of $a_0$ given by (3). In general the process of adjustment by (3) is 
equivalent to minimizing $\Sigma(\phi_r - \phi_s)^2$, where $r$ and $s$ indicate adjacent points of the net. 
Now suppose that we have a block of trial values and we wish to apply a uniform 
correction $\delta$ to all values within it, leaving the values outside it unaltered. Then the 
only terms in the sum that contain $\delta$ are those where $r$ indicates an edge point of the 
block and $s$ an outside point adjacent to it on the net, and these may be written 

$$\Sigma(\phi_r - \phi_s + \delta)^2,$$

taken over all such pairs of points, say $N$ in number; and the condition for this to be 
a minimum is 

$$\delta = \frac{1}{N}\Sigma(\phi_s - \phi_r). \tag{15}$$

This correction is applied to the whole of the block. 
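
Formula (15) translates directly into code. The sketch below is hypothetical in its details: the net is stored as a dictionary keyed by integer grid positions, and the example data (a uniform net with a raised 2 by 2 block) are invented purely to show the correction at work.

```python
def block_correction(phi, block):
    """delta = (1/N) * sum(phi_s - phi_r) over adjacent pairs, r in the block, s outside."""
    total, pairs = 0.0, 0
    for (i, j) in block:
        for (di, dj) in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            s = (i + di, j + dj)
            if s in phi and s not in block:
                total += phi[s] - phi[(i, j)]
                pairs += 1
    return total / pairs


# Invented illustration: a 4 x 4 net at the value 1.0 with a 2 x 2 block raised to 1.3.
phi = {(i, j): 1.0 for i in range(4) for j in range(4)}
block = {(1, 1), (1, 2), (2, 1), (2, 2)}
for p in block:
    phi[p] = 1.3

delta = block_correction(phi, block)
for p in block:
    phi[p] += delta  # the same correction is applied to the whole block

print(delta)
```

Point-by-point adjustment by (3) would remove such a uniform offset only slowly, whereas the single block correction removes it at once in this artificial case.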

In addition to the improvement made by block adjustment, we now discuss a method 
by which the convergence can often be made more rapid than that obtained by a naive 
use of (3). If $\phi$ at a point of the net differs by $-\delta$ from its mean value at neighbouring 
points, and we simply apply a correction $\delta$, then at the next approximation at the 
neighbouring points $\phi$ will be increased by $\tfrac14\delta$. Thus the difference is not removed, but only 
divided by 4, and further corrections will be needed. We can anticipate these by making 
the correction $\tfrac43\delta$ in the first place instead of $\delta$. (See also Note 9.16a.) 

For a wholly internal block (i.e. one entirely surrounded by adjustable values) the 
effect may be more serious. If we simply apply the correction $\delta$ given by (15) and adjust 
values at points adjacent to the block twice, the corrections at these may approach $8. 
Hence for such a block it is usually worth while to multiply by a factor greater than 
f before applying them; § is generally found satisfactory. 

The normal procedure would be to tabulate the suggested values of $\delta$ at each stage, 
using either (3), (4), or (7). The larger values would be multiplied by f and applied. 
This process will be varied in two ways. If it is noticed that a circuit of values at 
internal points need corrections nearly all of the same sign, a block correction will 
be evaluated for points within and on this circuit. If the problem indicates a singularity 
such that the solution near it is not of the form (1), it will be best to examine 
specially what the form will be, and to devise a method of approximation near the 
singularity that will be adapted to this form, as for (8). 

A block may contain an internal block and it may be convenient to adjust both together. 
If the whole block needs a correction $\delta$, and the treatment of the inner block indicates one 
of $\delta'$, the latter is really relative to the outer parts of the main block, and the whole correction 
needed by the inner block is $\delta + \delta'$. 


9.181. As an example we consider a condenser consisting of two concentric long square 
prisms of sides 2 and 4, similarly situated. The inner is at potential 1, the outer at potential 
0. Find the distribution of potential between them. Evidently the region consists of eight 
similar pieces and we need consider only one of them. As a first approximation we take 
the values at c, d by direct interpolation according to 9.18 (3), giving c = d = 0.50. For b 
we have (using points at 45° to the axes) 

$$\phi_b = \tfrac14(1 + 0 + 0 + \phi_b); \qquad \phi_b = 0.33,$$

and for a from (7) 

$$\phi_a = \tfrac45\cdot\tfrac14(2\phi_b + 0) + \tfrac15\cdot\tfrac14(1 + 0) = 0.13 + 0.05 = 0.18.$$

A second approximation to $\phi_b$ is now, from (7), 

$$\phi_b = \tfrac15(1 + \phi_a + \phi_c + 0) + \tfrac{1}{20}(0 + \phi_b + 1 + 0),$$

whence $\phi_b = 0.41$. Then $\phi_c$ and $\phi_d$ are corrected to 0.48 and 0.49. 

At this point it becomes worth while to retain an extra 
figure. We continue to use (7) and get 

$$\phi_a = 0.214, \quad \phi_b = 0.409, \quad \phi_c = 0.480, \quad \phi_d = 0.492.$$

This is as close as we need attempt without reducing the size of the meshes, since the values 
of $\phi_a$ and $\phi_b$ may be affected by the proximity of the corner. We therefore halve the size 
of the meshes; we interpolate to the centres of the coarse meshes by (3) for diagonal 
elements, and then fill up by (3) for adjacent elements. But for the three values nearest 
to the inner corner we also use (12); the results by this method are 0.706, 0.643, 0.475, as 
against 0.722, 0.660 and 0.508 by linear interpolation. The former are the better because 
they have taken the peculiar behaviour of the function in this region into account. The 
values are shown in the diagram. 

[First diagram: trial values of $\phi$ (in units of 0.001) on the halved mesh, with the blocks marked by dotted lines.]

We take blocks as indicated by dotted lines. For the upper block the correction indicated 
is given by 

— 165 = 2(197 + 156+106 + 54) —2(25 + 49 + 234+153) 

δ = −6.8; 






[Second diagram: revised values of $\phi$ (in units of 0.001) after the block corrections.]


but we allow partially for correlation with values at adjacent points by multiplying by 
and therefore take $\delta = -7$. 

Apply this correction and work out $\delta$ for the lower block; it is −4.1, which we similarly 
change to −5. The revised distribution, after the values near the corner have been 
readjusted, is as in the second diagram above. 

We now work out corrections by (7) for all elements; the largest indicated are the adjacent 
ones where the present values are 0.402 and 0.315, reaching −9 and −8 in the third 
figure. We apply corrections −14 and −12 and readjust all entries in turn. The result is 
shown in the next diagram. 


[Diagrams: values of $\phi$ (in units of 0.001) after this readjustment, and the final solution.]


A block correction gives $\delta = -2$ for the upper block. For the lower we use a double 
block for the sake of symmetry (it would have been better to do this at the previous block 
adjustment) and get $\delta = -3$. These corrections are applied; in the next adjustment the 
largest correction needed is −0.004, and the solution is as in the last diagram. It should 
be right to about 0.002. 

To get the capacity, apply Gauss’s theorem to the square with the same centre and side 
1.5. Using centred first differences we have the following values of $-\tfrac12\,\partial\phi/\partial x$. 

   y      −½ ∂φ/∂x      Δ
  1.5      0.200
  1.25     0.314      0.114
  1.0      0.428      0.114
  0.75     0.484      0.056
  0.50     0.496      0.012
  0.25     0.500      0.004
  0.00     0.501      0.001

By numerical integration the integral of $-\tfrac12\,\partial\phi/\partial x$ from $y = 0$ to $y = 1.5$ is 0.645, 
and the charge on the inner prism, per unit length, is 

$$\frac{1}{4\pi}\times 8\times 2\times 0.645 = 0.821,$$

which is the capacity per unit length. 
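
The numerical integration can be repeated from the tabulated values. The sketch below is an assumption about the details, since the text does not say which formula was used: it applies the trapezoidal rule with the first end correction of the Gregory formula to the seven tabulated values of $-\tfrac12\,\partial\phi/\partial x$ at interval 0.25, and then forms the charge from the integral.

```python
import math

h = 0.25
# Tabulated values of -(1/2) dphi/dx, ordered from y = 0.00 up to y = 1.5.
f = [0.501, 0.500, 0.496, 0.484, 0.428, 0.314, 0.200]

trapezoid = h * (sum(f) - 0.5 * (f[0] + f[-1]))
# First Gregory end correction, built from the first differences at the two ends.
integral = trapezoid - (h / 12.0) * ((f[-1] - f[-2]) - (f[1] - f[0]))

capacity = 8 * 2 * integral / (4 * math.pi)
print(integral, capacity)
```

The factor 2 restores $-\partial\phi/\partial x$ from the tabulated half-derivative and the factor 8 accounts for the eight similar pieces of the region.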

An analytical solution of the problem has been given by F. Bowman.* His result is 
0.8144. D. C. Gilles, by a relaxation method, has got 0.832.† For comparison, the capacity 
per unit length of a condenser formed of two circular cylinders of radii 1 and 2 is 

$$1/(2\log 2) = 0.721.$$

This adaptation of the method of finite differences to the solution of partial differential 
equations is due to L. F. Richardson, and successive approximation is always valid if the 
solution corresponds to making an integral a minimum. Extensions to many other types 
of differential equations, especially 

$$\nabla^2\phi = w(x, y), \qquad \nabla^4\phi = w(x, y),$$

have been given by Richardson‡ and by R. V. Southwell§ and his collaborators. A 
valuable introduction is given by L. Fox.‖ A method of adjusting the values near a corner, 
somewhat similar to that used in the above example, is due to H. Motz.¶ The method used 
above combines features of the methods of Richardson, Southwell and Motz. In a recent 
paper Bickley** gives approximations to $\nabla^2\phi$ and $\nabla^4\phi$ taking account of higher differences. 
One triumph of Southwell’s methods is the calculation of the form of a waterfall. 

* Proc. Lond. Math. Soc. (2) 39, 1935, 211-215; 41, 1936, 271-7. 

† Proc. Roy. Soc. A, 193, 1948, 428. 

‡ Phil. Trans. A, 210, 1910, 307-357; 226, 1927, 299-361 (with J. A. Gaunt). 

§ Relaxation Methods in Engineering Science, 1940; Relaxation Methods in Theoretical Physics, 
1946. 

‖ Quart. J. Mech. Appl. Math. 1, 1948, 253-280. 

¶ Quart. Appl. Math. 4, 1947, 371-7. 

** Quart. J. Mech. Appl. Math. 1, 1948, 35-42. 

EXAMPLES 


Practice in numerical work is the only way of learning it. The student should begin by taking 
selected entries from standard tables of mathematical functions and applying the methods described 
in this chapter. 

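1. If $g(x)$ is a polynomial of degree $n-1$ equal to $f(x)$ at $x_1, x_2, \ldots, x_n$, show that 

$$\begin{vmatrix} g(x) & 1 & x & \cdots & x^{n-1} \\ f(x_1) & 1 & x_1 & \cdots & x_1^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ f(x_n) & 1 & x_n & \cdots & x_n^{n-1} \end{vmatrix} = 0,$$

and that Lagrange’s and Newton’s interpolation formulae arise from two different ways of expanding 
this determinant. 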


2. Find the general solution of the difference equation 

A z y n = A a y n +12A y n . 

3. Show that when $u_n$ is a polynomial of degree $N$ in $n$ 

Zu n x" = -?- + --L - A"m 0 

0 1 — * 1 — a; i \1 — x) 

00 ^3 I /va 

if the series is absolutely convergent. Hence evaluate 2 

4. Find the real roots of the equation 
to three significant figures. 

5. Using 


(1.0. 1940.) 


! 3” 

(x—3) (a; 8 — 1) = 1 


(I.C. 1937.) 


(I.C. 1943.) 


$$\int_1^n \log x\,dx = n\log n - n + 1,$$

apply the Euler-Maclaurin formula and show that for integral $n$ 

i?2 B t 


log n! = C + (n+£)logn —n— 


1.2 n 3.4n 3 


6. Estimate 

7. Prove that 


»=in 2 + 5 2 * 


r -j-l/t 

f(x) dx = /o +/i + • • • +/n + rt+% “ j) — 6 7 6o(^ S /»+Va — ^ 3 /-V») + ■ 


/•% 

and check the formula by integrating I x 4 d*. 

J 


8. Taking logarithms of 


2n+l = f A 


2n +1 


2n — 1 

and using Richardson’s method for n = 2 and 3, derive a value for Euler’s constant. 

dy 


(0-5780.) 


9. A solution of the equation 


dx 


= 3 x 2 + y* 


passes through (0, 0). Tabulate its values, correct to three decimal places, at intervals 0.1, over the 
range $0 \leq x \leq 1$. (I.C. 1936.) 

10. Illustrate the method of relaxation by finding the values of $x_1, x_2, x_3$ that make the following 
function a minimum: 

V = 10«|+ 15x\ + 20xl + x 1 x 2 + 2x 1 x a -x 1 - 2-5x 3 -x t . (I.C. 1939.) 





Chapter 10 

CALCULUS OF VARIATIONS 


When change itself can give no more 
’Tis easy to be true. 

SIR CHARLES SEDLEY, Reasons for Constancy 


10.01. Condition for an integral to be stationary. Suppose that we have an 
integral of the form 

$$S = \int_{t_0}^{t_1} f\left(\frac{dx}{dt}, x, t\right) dt, \tag{1}$$

where $f$ is a given function; $x$ is to be a function of $t$, but we have not yet specified what 
function. The problem of the calculus of variations is to decide what function $x$ must be 
in order that $S$ may be stationary for small variations of $x$. In its simplest form we can 
consider the determination of the shortest distance between two points. Using Cartesian 
coordinates and assuming that $y$ is a differentiable function of $x$, with an integrable 
derivative, the distance along an arbitrary path is 

$$S = \int_{x_0}^{x_1}\left\{1 + \left(\frac{dy}{dx}\right)^2\right\}^{1/2} dx. \tag{2}$$

If the ends are specified, so that $y(x_1) = y_1$, $y(x_0) = y_0$, two given quantities, we know 
that $S$ is made a minimum by taking 

$$\frac{y - y_0}{y_1 - y_0} = \frac{x - x_0}{x_1 - x_0}. \tag{3}$$

This makes the path the straight line connecting $(x_0, y_0)$ and $(x_1, y_1)$. If we make $y$ any 
other function of $x$ we are choosing a different path, and its length will necessarily be 
greater than that of the straight line if the termini are kept the same. The characteristic 
feature of the calculus of variations, in contrast to ordinary problems of maxima and 
minima, is the occurrence of the unknown function or its derivative under the integral 
sign. To evaluate the integral (1) we must have the value of $x$ for every value of $t$ in the 
range; to make it stationary we have therefore, effectively, to determine an infinite 
number of values of $x$. 

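Let us consider two slightly different functions $x$ and $x'$, and write $x' - x = \delta x$, which 
we call the variation of $x$. The corresponding variation of $dx/dt$ is 

$$\delta\frac{dx}{dt} = \frac{dx'}{dt} - \frac{dx}{dt} = \frac{d}{dt}(\delta x). \tag{4}$$

Then 

$$\delta S = \int_{t_0}^{t_1}\left\{f\left(\frac{dx'}{dt}, x', t\right) - f\left(\frac{dx}{dt}, x, t\right)\right\} dt. \tag{5}$$

We can write $dx/dt = p$ and regard $f$ as a function of the three variables $p$, $x$, $t$; then to 
the first order 

$$\delta S = \int_{t_0}^{t_1}\left(\frac{\partial f}{\partial p}\delta p + \frac{\partial f}{\partial x}\delta x\right) dt. \tag{6}$$

No term in $\delta t$ is needed; for $\delta x$ is defined as a variation of $x$ for given $t$, and the integral 
can be regarded as the limit of a sum over the same ranges of $t$ whether we are considering 
$x$ or $x'$. Thus we are varying $x$ and $p$ without varying $t$. Now 

$$\int_{t_0}^{t_1}\frac{\partial f}{\partial p}\delta p\,dt = \int_{t_0}^{t_1}\frac{\partial f}{\partial p}\frac{d}{dt}(\delta x)\,dt = \left[\frac{\partial f}{\partial p}\delta x\right]_{t_0}^{t_1} - \int_{t_0}^{t_1}\frac{d}{dt}\left(\frac{\partial f}{\partial p}\right)\delta x\,dt, \tag{7}$$

and therefore 

$$\delta S = \left[\frac{\partial f}{\partial p}\delta x\right]_{t_0}^{t_1} + \int_{t_0}^{t_1}\left\{\frac{\partial f}{\partial x} - \frac{d}{dt}\frac{\partial f}{\partial p}\right\}\delta x\,dt. \tag{8}$$

Here $\delta p$ no longer appears, and we have to say what conditions are implied by the requirement 
that $\delta S = 0$ to the first order for all admissible forms of $\delta x$. 

If $\delta x$ was completely arbitrary it would follow at once that $\partial f/\partial p = 0$ at both limits 
and that 

$$\phi = \frac{\partial f}{\partial x} - \frac{d}{dt}\frac{\partial f}{\partial p} = 0 \tag{9}$$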

at all intermediate values. For if $\partial f/\partial p \neq 0$ at $t = t_0$ we could take $\delta x \neq 0$ at $t = t_0$ and zero 
everywhere else, and then $\delta S$ would not vanish to the first order. If there is any intermediate 
range of values of $t$ such that $\phi$ is positive at all points within it we could take 
$\delta x = 0$ outside this range and positive within it, and again $\delta S$ would not vanish to the 
first order. This argument is to be found in some text-books, but is not quite complete, 
for $\delta x$ is not completely arbitrary. The existence of $\delta p$ throughout the range implies that 
$\delta x$ is differentiable, and consequently we cannot take it different from zero at $t_0$ and zero 
everywhere else. But if, for instance, $\partial f/\partial p \neq 0$ at $t = t_0$ we could take 

$$\delta x = \alpha\frac{(t-\tau)^2}{(t_0-\tau)^2} \quad (t_0 \leq t \leq \tau), \qquad \delta x = 0 \quad (\tau \leq t \leq t_1), \tag{10}$$

where $\tau - t_0$ may be as small as we like. Then $\delta x$ is differentiable, and by taking $\tau - t_0$ sufficiently 
small, keeping $\alpha$ constant, we can ensure that $\delta S$ has the same sign as the integrated part. 
Again, if there is any range, say from $t = a$ to $t = b$, where $\phi$ is positive, we could take 

$$\delta x = 0 \quad (t_0 \leq t \leq a), \qquad \delta x = \alpha(t-a)^2(b-t)^2 \quad (a \leq t \leq b), \qquad \delta x = 0 \quad (b \leq t \leq t_1), \tag{11}$$

and $\delta x$ is differentiable; and with this form of $\delta x$, $\delta S$ would not vanish to the first order. 
Hence if $\delta S$ is to vanish to the first order for all variations $\delta x$ that are differentiable with 
respect to $t$, $\partial f/\partial p$ must vanish at both limits and $\phi = 0$ at all intermediate values. $\partial f/\partial p$ 
will in general involve $p$; hence $\phi = 0$ is ordinarily a differential equation of the second 
order for $x$. 

This argument is applicable to a wide range of problems, but is not quite general. In 
writing (6) we have assumed that $f$ has partial derivatives with regard to $x$ and $p$ throughout 
the range, and in deriving (7) we have assumed that $\dfrac{d}{dt}\dfrac{\partial f}{\partial p}$ exists. These conditions may not 
be satisfied, and if they are not, totally different and much more difficult methods become 
necessary.* Fortunately in practice they usually are satisfied. The conditions of the 
problem may also include the condition that $\delta x = 0$ at the termini. This happens in the 
simple problem of finding the shortest path between two given points. For if in (2) we 
are given $y_0$ and $y_1$ our data forbid us to vary $y$ at the termini, and the admissible forms of 
$\delta y$ are all such that $\delta y_0 = \delta y_1 = 0$; but then it does not follow that $\partial f/\partial p$ vanishes at the 
ends, and the two terminal conditions to be satisfied by the solution are no longer $\partial f/\partial p = 0$ 
but that $y$ has to take the assigned values. It will be noticed that in both cases we get 
two terminal conditions, the normal number that can be satisfied by the solution of an 
equation of the second order. 


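10·011. A very important case is where $f$ does not contain $t$ explicitly. In that case
we multiply the differential equation by $p$:

$$\frac{\partial f}{\partial x}\frac{dx}{dt} = p\,\frac{d}{dt}\frac{\partial f}{\partial p}. \tag{12}$$

But

$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial p}\frac{dp}{dt}, \qquad \frac{d}{dt}\Bigl(p\,\frac{\partial f}{\partial p}\Bigr) = \frac{\partial f}{\partial p}\frac{dp}{dt} + p\,\frac{d}{dt}\Bigl(\frac{\partial f}{\partial p}\Bigr), \tag{13}$$

since $\partial f/\partial t = 0$. Therefore

$$\frac{d}{dt}\Bigl(p\,\frac{\partial f}{\partial p} - f\Bigr) = p\,\frac{d}{dt}\frac{\partial f}{\partial p} - \frac{\partial f}{\partial x}\frac{dx}{dt} = 0, \tag{14}$$

and a first integral of the differential equation is

$$p\,\frac{\partial f}{\partial p} - f = \text{constant}. \tag{15}$$

10·012. This case is exemplified by (2); writing

$$p = dy/dx, \qquad f = (p^2+1)^{1/2},$$

we have

$$p\,\frac{\partial f}{\partial p} - f = \frac{p^2}{(p^2+1)^{1/2}} - (p^2+1)^{1/2} = -\frac{1}{(p^2+1)^{1/2}}.$$

Hence $p$ is constant along the path, which is therefore a straight line.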
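As a quick numerical check of the first integral for the arc-length integrand, the sketch below evaluates $p\,\partial f/\partial p - f$ with a central difference and compares it with $-1/(p^2+1)^{1/2}$. The function names and the step size `h` are illustrative choices, not part of the text.

```python
import math

def f(p):
    """Arc-length integrand f = (p^2 + 1)^(1/2); it does not contain x explicitly."""
    return math.sqrt(p * p + 1.0)

def first_integral(p, h=1e-6):
    """Evaluate p * df/dp - f, with df/dp taken by a central difference of step h."""
    dfdp = (f(p + h) - f(p - h)) / (2.0 * h)
    return p * dfdp - f(p)

for p in (-2.0, 0.0, 0.5, 3.0):
    # The two printed values should agree to about the finite-difference accuracy.
    print(p, first_integral(p), -1.0 / math.sqrt(p * p + 1.0))
```

Since the expression depends on $p$ alone, its constancy along the path forces $p$ itself to be constant, which is the straight-line result.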


* For illustrations of the failure of the present method, see Courant and Robbins, What is
Mathematics?

10·013. A slightly more complicated problem is that of the brachistochrone, first
propounded by John Bernoulli. Let $A$ and $B$ be two points connected by a smooth wire,
$A$ being higher than $B$. A bead free to slide on the wire is released from $A$; what must be the
form of the wire in order that the bead may take the shortest possible time to reach $B$?

Take $A$ as origin and the axis of $y$ downwards. Then the velocity of the bead when at
depth $y$ is $\sqrt{2gy}$ and the time taken for $x$ to reach a given value $X$ is, with $dy/dx = p$,

$$T = \int \frac{ds}{\sqrt{2gy}} = \int_0^X \frac{(p^2+1)^{1/2}}{\sqrt{2gy}}\,dx.$$

For this to be stationary for variations of the path, with the ends fixed, we have the
first integral

$$\frac{p^2}{(p^2+1)^{1/2}\sqrt{2gy}} - \frac{(p^2+1)^{1/2}}{\sqrt{2gy}} = -\frac{1}{(p^2+1)^{1/2}\sqrt{2gy}} = \text{constant},$$

and therefore

$$y(1+p^2) = a.$$

This is integrated by the substitution $y = a\sin^2\theta$ (so that $\theta = 0$ when the bead is at $A$),
and gives

$$x = \beta + a(\theta - \tfrac{1}{2}\sin 2\theta).$$

Since $x$ is taken 0 at $A$, $\beta = 0$. The path is therefore a cycloidal arc with $A$ as a cusp. If
the values of $x$ and $y$ at $B$ are given we have only to choose $a$ so that the path goes through
$B$ and we have the solution.

This answers the question if $B$ is given. But suppose that we are given only that $x$ has
a given value at $B$, not the value of $y$ there. Then we need the further condition at $B$
denoted above by $\partial f/\partial p = 0$, which gives that $p = 0$ at $B$. Thus if we are told only that
the lower terminus is in a given vertical line, the cycloid required is the one that cuts
that line horizontally.
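The choice of $a$ for a given $B$ can be sketched numerically. The code below assumes the parametrization $x = a(\theta - \tfrac12\sin 2\theta)$, $y = a\sin^2\theta$ and finds the value of $\theta$ at $B$ by bisection on the ratio $x/y$. The descent time $\sqrt{2a/g}\,\theta_B$ follows from $ds = 2a\sin\theta\,d\theta$ and speed $\sqrt{2ga}\sin\theta$; this step is worked out here and is not quoted from the text. Function names, the value of `g` and the sample end point are illustrative assumptions.

```python
import math

def cycloid_through(xb, yb, tol=1e-12):
    """Return (a, theta_b) such that the cycloid from the origin passes through (xb, yb)."""
    def ratio(theta):
        # x / y along the cycloid; increases from 0 towards infinity on (0, pi)
        return (theta - 0.5 * math.sin(2.0 * theta)) / math.sin(theta) ** 2

    lo, hi = 1e-9, math.pi - 1e-9
    target = xb / yb
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    theta_b = 0.5 * (lo + hi)
    return yb / math.sin(theta_b) ** 2, theta_b

def time_cycloid(a, theta_b, g=9.81):
    """Descent time along the cycloid, sqrt(2a/g) * theta_b."""
    return math.sqrt(2.0 * a / g) * theta_b

def time_straight(xb, yb, g=9.81):
    """Descent time from rest along the straight chord to (xb, yb), y measured downwards."""
    length = math.hypot(xb, yb)
    return length * math.sqrt(2.0 / (g * yb))

a, theta_b = cycloid_through(math.pi / 2.0, 1.0)
print(a, theta_b, time_cycloid(a, theta_b), time_straight(math.pi / 2.0, 1.0))
```

For the sample point $(\pi/2, 1)$ the bisection should return $a$ close to 1 with $\theta_B$ close to $\pi/2$, and the cycloid time comes out shorter than the chord time.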


10·014. Sufficiency of conditions; maxima and minima. We have obtained
necessary conditions for the integral to be stationary; in the cases just considered it is
clear that they are also sufficient. To decide whether the choice makes the integral a
minimum or a maximum, or merely stationary without being a maximum or a minimum
for all possible variations, requires that account should be taken of the squares of the
variations. In the problem of the shortest distance between two points this is simple.
We can take the line joining the points as the axis of $x$; then

$$\int (1+p^2)^{1/2}\,dx \ge \int dx,$$

and the length of the path chosen is a minimum.

10·015. Variation of the limits. In obtaining (8) we have taken the limits $t_0$, $t_1$ as
given. If they also are subject to variations $\Delta t_0$, $\Delta t_1$, $S$ will be increased by $(f\,\Delta t)_{t=t_1}$ at
the upper limit and decreased by $(f\,\Delta t)_{t=t_0}$ at the lower. The effect of allowing variation
of the limits is therefore to change the integrated part to

$$\Bigl[\frac{\partial f}{\partial p}\,\delta x + f\,\Delta t\Bigr]_{t_0}^{t_1}. \tag{16}$$

In this expression, however, $\delta x$ is the variation of $x$ for given $t$, and therefore we must
not replace $t_1$ by $t_1 + \Delta t_1$ in calculating $\delta x_1$. If the varied $x$ at $t_1 + \Delta t_1$ is $x_1 + \Delta x_1$ we have

$$\Delta x_1 = \delta x_1 + p_1\,\Delta t_1, \tag{17}$$

and the integrated part can be written

$$\Bigl[\frac{\partial f}{\partial p}\,\Delta x + \Bigl(f - p\,\frac{\partial f}{\partial p}\Bigr)\Delta t\Bigr]_{t_0}^{t_1}. \tag{18}$$

Take, for example, the problem of finding the shortest distance from a given point
$(a, b)$ to a given line $x\cos\alpha + y\sin\alpha = 1$. As before, the path must be a straight line. But
at the intersection the possible variations of $y$ entail corresponding variations of $x$, since
$\Delta y_1 = -\cot\alpha\,\Delta x_1$. Then (18) is

$$\frac{p}{(1+p^2)^{1/2}}\,\Delta y_1 + \frac{1}{(1+p^2)^{1/2}}\,\Delta x_1 = \frac{1 - p\cot\alpha}{(1+p^2)^{1/2}}\,\Delta x_1,$$

and vanishes if $p = \tan\alpha$.

Notice that since $\delta y_1$ is the variation of $y_1$ with $x_1$ kept constant we could conveniently
write it as $(\delta y_1)_x$, in accordance with a convention used in thermodynamics; and then
$\Delta y_1$ could be written $(\delta y_1)_{x\cos\alpha + y\sin\alpha}$. It is curious that in spite of the obvious need
in partial differentiation for precise statement of what is being kept constant, such
statement is not embodied in the customary notation of pure mathematics; though it is
provided in thermodynamics, the theory of partial correlation in statistics, and in
probability theory.*
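The terminal condition $p = \tan\alpha$ can be checked by brute force. The following sketch finds the point of the line $x\cos\alpha + y\sin\alpha = 1$ nearest to $(a, b)$ by a ternary search along the line, then compares the slope of the joining segment with $\tan\alpha$. The parametrization of the line, the search interval and the sample numbers are assumptions made for illustration.

```python
import math

def nearest_point_on_line(a, b, alpha, half_width=100.0, iters=200):
    """Ternary search for the point of x cos(alpha) + y sin(alpha) = 1 nearest to (a, b)."""
    def point(s):
        # s is the displacement along the line from the point (cos alpha, sin alpha)
        return (math.cos(alpha) - s * math.sin(alpha),
                math.sin(alpha) + s * math.cos(alpha))

    def dist2(s):
        x, y = point(s)
        return (x - a) ** 2 + (y - b) ** 2

    lo, hi = -half_width, half_width
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dist2(m1) < dist2(m2):
            hi = m2
        else:
            lo = m1
    return point(0.5 * (lo + hi))

alpha = 0.6
a, b = 0.2, -0.3
x1, y1 = nearest_point_on_line(a, b, alpha)
slope = (y1 - b) / (x1 - a)
print(slope, math.tan(alpha))   # the two values should nearly coincide
```

The squared distance is a convex quadratic in the line parameter, which is why a ternary search is adequate here.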

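10·02. Several dependent variables. The extension to the case where there are
several dependent variables is quite straightforward. If the dependent variables are
$q_1, q_2, \ldots, q_n$ and we denote their derivatives with regard to $t$ by $\dot q_r$, the variation of

$$S = \int_{t_0}^{t_1} f(q_1, \ldots, q_n;\ \dot q_1, \ldots, \dot q_n;\ t)\,dt \tag{1}$$

for small variations of the functions of $t$ chosen for the $q$'s is

$$\delta S = \Bigl[\sum_r \frac{\partial f}{\partial \dot q_r}\,\Delta q_r + \Bigl(f - \sum_r \dot q_r \frac{\partial f}{\partial \dot q_r}\Bigr)\Delta t\Bigr]_{t_0}^{t_1} + \int_{t_0}^{t_1} \sum_r \Bigl(\frac{\partial f}{\partial q_r} - \frac{d}{dt}\frac{\partial f}{\partial \dot q_r}\Bigr)\delta q_r\,dt, \tag{2}$$

where, if $\Delta t_1 \neq 0$,

$$(\Delta q_r)_1 = (q_r + \delta q_r)_{t_1 + \Delta t_1} - (q_r)_{t_1}, \tag{3}$$

and similarly if $\Delta t_0 \neq 0$. It follows that if the variations $\delta q_r$ can be chosen independently
of one another the conditions for $S$ to be stationary are

$$\frac{\partial f}{\partial q_r} - \frac{d}{dt}\frac{\partial f}{\partial \dot q_r} = 0 \quad (r = 1, 2, \ldots, n), \tag{4}$$

$$\sum_r \frac{\partial f}{\partial \dot q_r}\,\Delta q_r + \Bigl(f - \sum_r \dot q_r \frac{\partial f}{\partial \dot q_r}\Bigr)\Delta t = 0, \tag{5}$$

the latter condition holding at each limit. If $f$ does not involve $t$ explicitly there will be a
first integral as in 10·011,

$$f - \sum_r \dot q_r \frac{\partial f}{\partial \dot q_r} = \text{constant}. \tag{6}$$

If $\Delta t_0 = \Delta t_1 = 0$,

$$\sum_r \delta q_r\,\frac{\partial f}{\partial \dot q_r} = 0 \tag{7}$$

at the limits.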

10·03. Most physical applications of the calculus of variations fall under three types.
(1) Determination of conditions of equilibrium from the condition that the potential
energy must be stationary. (2) Fermat's principle in wave transmission, that the path is
such that the time of transmission is stationary for small variations of the path.
(3) Hamilton's principle in dynamics.

* Cf. Yule, J. Roy. Statist. Soc. 99, 1936, 770-1.

10·04. Fermat's principle. The examples that we have already considered can be
used to illustrate Fermat's principle. If the velocity of a wave is the same at all points of
the medium, the time of travel is proportional to the distance along the ray, and therefore
is stationary if the ray is straight. If the velocity is proportional to $z^{1/2}$, where $z$ is the
distance from some fixed plane, the time of travel is proportional to $\int ds/\sqrt{z}$, and making
this stationary involves precisely the same analysis as the brachistochrone problem;
the rays will therefore be cycloids with their cusps on $z = 0$. This example looks artificial,
but actually seems to fit the propagation of explosion waves in clay.

10·041. Another interesting case is that of wave propagation when the velocity is
proportional to the distance from a fixed plane. This arises in the seismic survey of the
earth's outer layers; an explosion is made near the surface, and the times of arrival of the
waves are recorded over a range of distance.

The velocity is $c(z_0 + z)$, where $c$ and $z_0$ are constant; then the time of transmission from
$(0, 0)$ to $(x_1, z_1)$ is

$$T = \int_0^{x_1} \frac{(1+p^2)^{1/2}\,dx}{c(z_0+z)},$$

with $p = dz/dx$ taken along the ray. $x$ does not occur in the integrand; hence a first
integral is

$$\frac{(1+p^2)^{1/2}}{c(z_0+z)} - \frac{p^2}{(1+p^2)^{1/2}\,c(z_0+z)} = \text{constant},$$

that is, $(1+p^2)^{1/2}(z_0+z) = \text{constant}$.

Let the ray begin at an angle $e$ to the surface, so that at $z = 0$, $p = \tan e$, and

$$(1+p^2)^{1/2}(z_0+z) = z_0 \sec e.$$

Then

$$x_1 = \int_0^{z_1} \frac{(z_0+z)\,dz}{\{z_0^2\sec^2 e - (z_0+z)^2\}^{1/2}} = \Bigl[-\{z_0^2\sec^2 e - (z_0+z)^2\}^{1/2}\Bigr]_0^{z_1} = z_0\tan e - \{z_0^2\sec^2 e - (z_0+z_1)^2\}^{1/2},$$

and the ray is part of the circle

$$(x - z_0\tan e)^2 + (z + z_0)^2 = z_0^2\sec^2 e.$$

The deepest point of the ray is at $x = z_0\tan e$, $z = z_0(\sec e - 1)$. The time of travel to this
point is

$$\int_0^{z_0\tan e} \frac{z_0\sec e\,dx}{c(z_0+z)^2} = \int_0^{z_0\tan e} \frac{z_0\sec e\,dx}{c\{z_0^2\sec^2 e - (x - z_0\tan e)^2\}} = \frac{1}{c}\Bigl[\tanh^{-1}\frac{x - z_0\tan e}{z_0\sec e}\Bigr]_0^{z_0\tan e} = \frac{1}{c}\tanh^{-1}\sin e.$$

The ray is refracted symmetrically up to the surface again; if $X$, $T$ are the horizontal
distance traversed and the time taken when it again reaches the surface

$$X = 2z_0\tan e, \qquad T = \frac{2}{c}\tanh^{-1}\sin e.$$

This gives in terms of the parameter $e$ the relation between distance and time of
transmission between points on the surface.
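The closed forms for $X$ and $T$ can be compared with a direct quadrature of $\int ds / \{c(z_0+z)\}$ along the circular ray. In the sketch below the circle is parametrized by an angle $\phi$ with $x - z_0\tan e = R\sin\phi$, $z + z_0 = R\cos\phi$, $R = z_0\sec e$, so that $\phi$ runs from $-e$ to $e$. This parametrization, the function names and the sample values are assumptions introduced for illustration.

```python
import math

def surface_range_and_time(e, z0, c):
    """Closed forms X = 2 z0 tan e and T = (2/c) artanh(sin e)."""
    return 2.0 * z0 * math.tan(e), (2.0 / c) * math.atanh(math.sin(e))

def travel_time_numeric(e, z0, c, n=20000):
    """Midpoint-rule value of the integral of ds / (c (z0 + z)) along the circular ray."""
    radius = z0 / math.cos(e)
    dphi = 2.0 * e / n
    total = 0.0
    for k in range(n):
        phi = -e + (k + 0.5) * dphi
        z_plus_z0 = radius * math.cos(phi)        # value of z0 + z on the circle
        total += radius * dphi / (c * z_plus_z0)  # ds = R dphi
    return total

X, T = surface_range_and_time(0.5, 2.0, 3.0)
print(X, T, travel_time_numeric(0.5, 2.0, 3.0))
```

Tabulating `surface_range_and_time` over a range of $e$ gives the time-distance curve in parametric form.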

10·05. Restricted variation: catenary. The admissible variations may be connected
by some condition that makes them not independent. Consider, for instance, a uniform
chain hanging from two fixed points; the position is one of minimum potential energy
under gravity, and therefore if $y$ is the height at any point $\int y\,ds$ is stationary. But the
length of the chain is fixed; hence we can vary $y$ only in such ways that $\delta\int ds = 0$. Then
if $\lambda$ is any constant the variations will satisfy

$$\delta\int (y-\lambda)\,ds = \delta\int (y-\lambda)(1+y'^2)^{1/2}\,dx = 0,$$

and conversely if we can find a $\lambda$ such that this is satisfied for all variations of $y$, then
$\int y\,ds$ is stationary for all variations that do not alter $\int ds$. The condition required has the
first integral

$$\frac{y'^2}{(1+y'^2)^{1/2}}\,(y-\lambda) = (y-\lambda)(1+y'^2)^{1/2} - c,$$

that is,

$$\frac{y-\lambda}{(1+y'^2)^{1/2}} = c, \qquad x - a = \int \frac{c\,dy}{\{(y-\lambda)^2 - c^2\}^{1/2}}, \qquad y = \lambda + c\cosh\frac{x-a}{c}.$$

We are given the values of $y$ for two fixed values of $x$, and also the length of the chain.
These three data suffice to determine the three constants $\lambda$, $c$, $a$.
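To illustrate how the data fix the constants, the sketch below treats the special case of two supports at equal height a distance `d` apart, an assumption made here to keep the example short. By symmetry $a$ is the mid-point, the length condition reads $2c\sinh(d/2c) = \text{length}$, and $\lambda$ then follows from the support height. The function names and the bisection bracket are illustrative.

```python
import math

def catenary_parameter(d, length):
    """Solve 2 c sinh(d / (2 c)) = length for c by bisection (supports at equal height)."""
    if length <= d:
        raise ValueError("the chain must be longer than the span")

    def excess(c):
        return 2.0 * c * math.sinh(d / (2.0 * c)) - length

    lo, hi = d / 200.0, d          # excess(lo) > 0 unless the chain is enormously long
    while excess(hi) > 0.0:        # enlarge the bracket until the sign changes
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def catenary_offset(d, support_height, c):
    """Constant lambda such that y = lambda + c cosh(x / c) meets the supports at x = +-d/2."""
    return support_height - c * math.cosh(d / (2.0 * c))

c = catenary_parameter(2.0, 3.0)
lam = catenary_offset(2.0, 0.0, c)
print(c, lam, lam + c)   # lam + c is the height of the lowest point
```

With supports at unequal heights the same idea applies, but $a$ must be solved for together with $c$.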

10·06. Hamilton's principle. Consider a system of $n$ particles, a typical particle
having mass $m_r$ and coordinates $x_{ri}$. The components of force acting on it are $X_{ri}$. Then the
equations of motion are, for each $r$,

$$m_r\ddot x_{ri} = X_{ri} \quad (r = 1, 2, \ldots, n;\ i = 1, 2, 3). \tag{1}$$

Multiply these by a set of small vectorial displacements $\delta x_{ri}$, which are arbitrary functions
of the time, and add; then we have (summation with respect to $r$ being explicit)

$$\sum_r m_r\ddot x_{ri}\,\delta x_{ri} = \sum_r X_{ri}\,\delta x_{ri}. \tag{2}$$

This equation is completely equivalent to the equations of motion; for the $\delta x_{ri}$ are
completely arbitrary and we may therefore equate all their coefficients and recover the
original equations. Now integrate between two given times; we have

$$\int_{t_0}^{t_1} \sum_r m_r\ddot x_{ri}\,\delta x_{ri}\,dt = \int_{t_0}^{t_1} \sum_r X_{ri}\,\delta x_{ri}\,dt. \tag{3}$$

Now

$$\int_{t_0}^{t_1} m_r\ddot x_{ri}\,\delta x_{ri}\,dt = \Bigl[m_r\dot x_{ri}\,\delta x_{ri}\Bigr]_{t_0}^{t_1} - \int_{t_0}^{t_1} m_r\dot x_{ri}\,\frac{d}{dt}\delta x_{ri}\,dt = \Bigl[m_r\dot x_{ri}\,\delta x_{ri}\Bigr]_{t_0}^{t_1} - \int_{t_0}^{t_1} m_r\dot x_{ri}\,\delta\dot x_{ri}\,dt, \tag{4}$$

and therefore (3) is the same as

$$\Bigl[\sum_r m_r\dot x_{ri}\,\delta x_{ri}\Bigr]_{t_0}^{t_1} - \int_{t_0}^{t_1} \sum_r m_r\dot x_{ri}\,\delta\dot x_{ri}\,dt = \int_{t_0}^{t_1} \sum_r X_{ri}\,\delta x_{ri}\,dt. \tag{5}$$

If then the $\delta x_{ri}$ are zero at the limits,

$$\delta S = \int_{t_0}^{t_1} \Bigl\{\delta\Bigl(\tfrac{1}{2}\sum_r m_r\dot x_{ri}^2\Bigr) + \sum_r X_{ri}\,\delta x_{ri}\Bigr\}\,dt = 0. \tag{6}$$

This is the most general form of Hamilton's principle in classical dynamics. The function
$\tfrac{1}{2}\sum_r m_r\dot x_{ri}^2$ is the kinetic energy, $T$. If there is a work function $W$, a function of the $x_{ri}$,
and possibly of $t$, but not of $\dot x_{ri}$, such that

$$X_{ri} = \frac{\partial W}{\partial x_{ri}}, \tag{7}$$

$\sum_r X_{ri}\,\delta x_{ri} = \delta W$, the variation of $W$ when the coordinates are varied by $\delta x_{ri}$; and (6)
becomes

$$\delta S = \delta\int_{t_0}^{t_1} (T + W)\,dt = 0. \tag{8}$$

This is the form taken by the principle if the system is conservative.

The expressions in (6) and (8) are scalars, so that the device of introducing the variations
has enabled us, for $n$ particles, to summarize $3n$ equations of motion in one scalar equation.


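10·07. Generalized coordinates and Lagrange's equations. Now consider $3n$
functions $q_s$ of the coordinates, such that, if they are known, all the coordinates are
determinate. We can then write

$$x_{ri} = x_{ri}(q_1, \ldots, q_{3n}), \tag{9}$$

and call the $q_s$ generalized coordinates. Then, if we use the summation convention with
regard to $s$,

$$\delta x_{ri} = \frac{\partial x_{ri}}{\partial q_s}\,\delta q_s, \qquad \dot x_{ri} = \frac{\partial x_{ri}}{\partial q_s}\,\dot q_s, \qquad \frac{\partial \dot x_{ri}}{\partial \dot q_s} = \frac{\partial x_{ri}}{\partial q_s}, \tag{10}$$

$$2T = \sum_r m_r\dot x_{ri}^2 = \sum_r m_r\,\frac{\partial x_{ri}}{\partial q_s}\frac{\partial x_{ri}}{\partial q_t}\,\dot q_s\dot q_t, \tag{11}$$

$$\sum_r X_{ri}\,\delta x_{ri} = \sum_r X_{ri}\,\frac{\partial x_{ri}}{\partial q_s}\,\delta q_s = Q_s\,\delta q_s, \tag{12}$$

and $\delta S$ can be put into the form

$$\delta S = \int_{t_0}^{t_1} (\delta T + Q_s\,\delta q_s)\,dt, \tag{13}$$

where $T$ and $Q_s$ are now given functions of the $q_s$ and $\dot q_s$. Then

$$\delta S = \Bigl[\frac{\partial T}{\partial \dot q_s}\,\delta q_s\Bigr]_{t_0}^{t_1} + \int_{t_0}^{t_1} \Bigl(\frac{\partial T}{\partial q_s} - \frac{d}{dt}\frac{\partial T}{\partial \dot q_s} + Q_s\Bigr)\delta q_s\,dt, \tag{14}$$

and the condition that $\delta S = 0$ to the first order for all differentiable $\delta q_s$ gives, on equating
coefficients,

$$\frac{d}{dt}\frac{\partial T}{\partial \dot q_s} - \frac{\partial T}{\partial q_s} = Q_s. \tag{15}$$

These are Lagrange's equations. They are usually obtained in text-books on dynamics
by direct transformation of (2); but the derivation from Hamilton's principle explains
also why the left side has the characteristic form of the calculus of variations.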

Now it may happen that in the actual motion certain relations between the $x_{ri}$, and
therefore between the $q_s$, are specified. The most important case is where many particles
belong to the same rigid body, and the coordinates can vary only in such ways that
distances between particles of the same body remain unaltered. Another is when some
coordinate is constrained by external force to vary in some prescribed way with time, as
when a part of the system is made to move with given linear or angular velocity. Such
constraints do not prevent us from considering variations $\delta q_s$ such that the constraints
are violated, and we can therefore still treat all the $\delta q_s$ as independent and equate their
respective coefficients. Then (15) remain true. But their physical interpretation is altered.
Whereas in a system of free particles they are differential equations for the separate
coordinates, in a system with constraints some of the $q_s$ in the actual motion will be
determinate functions of the others and of the time, and the corresponding $Q_s$ will be the
reactions needed to keep the constraints satisfied. It then becomes convenient to use one
set of $q_s$ just sufficient in number for them all to be varied without violating the
constraints; and then Lagrange's equations will hold for this set, and the other $q_s$ need not
be considered unless we want to know the corresponding reactions explicitly. But some
of the latter set may be given functions of the time, and in that case the $x_{ri}$ will depend
on the time as well as on the unconstrained $q_s$, and the time may appear explicitly in the
kinetic energy. This does not affect the form of (15), but it will affect the first integrals.

For a rigid body we need six coordinates to say exactly where all its particles are.
D'Alembert's principle follows at once if the body is regarded as made up of particles such
that the force between any pair is along the line joining them. For the two forces of any
pair add up to 0 and so do their moments about any axis. Also if $x_i$, $y_i$ are the coordinates
of two particles $r$ apart and $X'_i$ the force on the first due to the second, with resultant $R'$,
then the contribution from their reactions to $\sum_r X_{ri}\,\delta x_{ri}$ is

$$X'_i(\delta x_i - \delta y_i) = \frac{R'}{r}(x_i - y_i)(\delta y_i - \delta x_i) = -R'\,\delta r,$$

which equals 0 if the variations are such that the distance between the particles is
unaltered. Even if the reactions between a pair of particles are not along the line joining
them, so long as the internal reactions have a work function depending only on the mutual
distances of the particles, without necessarily being separable into terms each depending
only on the distance between two of them, it will follow that they contribute nothing to
$\sum_r X_{ri}\,\delta x_{ri}$ whenever the $\delta x_{ri}$ are such that they do not alter the mutual distances. The
generalization of such a sum for an elastic solid would be minus the change of elastic strain
energy, which vanishes if the distances between particles are unaltered. Without some
equivalent supposition there seems to be no reason why d'Alembert's principle should
be true, but in any case it is really an approximation since all real solids have some
elasticity.

10·071. Non-holonomic systems. It sometimes happens that some linear relation connects
the velocities but is not integrable, so that it is impossible to use it to eliminate one coordinate in
favour of the others and leave the variations independent. This happens particularly in problems of
rolling spheres and disks in three dimensions. Such systems are called non-holonomic. The method
can still be adapted to their treatment by the use of undetermined multipliers. For simplicity let
us suppose that there is only one such constraint, of the form

$$a_s\dot q_s = 0, \tag{1}$$

where the $a_s$ will involve the $q_s$. We consider variations such that

$$a_s\,\delta q_s = 0. \tag{2}$$

We have

$$\Bigl(\frac{d}{dt}\frac{\partial T}{\partial \dot q_s} - \frac{\partial T}{\partial q_s} - Q_s\Bigr)\delta q_s = 0, \tag{3}$$

provided the $\delta q_s$ are such that the reactions do no work. This implies, in cases of rolling, that the $\delta q_s$
are such that they involve no slipping; if there was slipping the tangential force would do work equal
to its amount times the amount of slip. Hence the condition that (3) may be true, with unaltered $Q_s$,
is simply (2). We do not assume that the varied path itself satisfies the constraints, and in general it
does not; but we do assume that the $\delta q_s$ do. Then for any $\lambda$

$$\Bigl(\frac{d}{dt}\frac{\partial T}{\partial \dot q_s} - \frac{\partial T}{\partial q_s} - Q_s - \lambda a_s\Bigr)\delta q_s = 0. \tag{4}$$

Choose a particular $q_s$, say $q_1$, where $a_1 \neq 0$, and suppose $\lambda$ chosen so that the coefficient of $\delta q_1$ in (4)
vanishes; that is,

$$\frac{d}{dt}\frac{\partial T}{\partial \dot q_1} - \frac{\partial T}{\partial q_1} - Q_1 - \lambda a_1 = 0. \tag{5}$$

Then we can assign all the other $\delta q_s$ arbitrarily, since with any choice of their values $\delta q_1$ is determined
by (2), and contributes nothing to (4) on account of (5). Then since (4) is true for all $\delta q_s$ ($s = 2, 3, \ldots$),
we can equate coefficients and get

$$\frac{d}{dt}\frac{\partial T}{\partial \dot q_s} - \frac{\partial T}{\partial q_s} - Q_s - \lambda a_s = 0 \quad (s \neq 1). \tag{6}$$

If there are $m$ coordinates $q_s$ these equations with (5) give $m$ differential equations involving the
coordinates and also the unknown $\lambda$; but we have also (1), and the equations are in general soluble.

In spite of the apparent simplicity and generality of the method of undetermined multipliers it is
hardly ever used for concrete problems of non-holonomic systems. We see that the sum $\lambda a_s\,\delta q_s$ is added
to $Q_s\,\delta q_s$, and therefore is the work done by the reactions in an arbitrary displacement not satisfying
(2). If the multiplier is chosen so that $a_s\dot q_s$ is the velocity of slip, $a_s\,\delta q_s$ is the amount of slip in an
arbitrary displacement, and $-\lambda$ is therefore the reaction resisting slipping. The method therefore does
not avoid the explicit introduction of the reactions, but merely gives another way of determining
them. It does require the explicit statement of all the coordinates, which the moving axes method
often avoids. For a rolling sphere, for instance, the method of moving axes need not concern itself at
all with the absolute position of any axis fixed in the sphere; it states the equations of motion directly
in terms of angular velocities with respect to axes conveniently chosen with respect to the surface
that the sphere is rolling on. The method of undetermined multipliers requires the introduction of
three Eulerian angles and their subsequent elimination, since their actual behaviour is usually of
negligible interest. In the most complicated problem of rolling known to us, Whipple's treatment of
the stability of a bicycle,* Appell's equations were used in preference to either method usually taught.


* Q. J. Pure and Appl. Math. 30, 1899, 312-48.

10·072. First integrals. For a conservative system, there is a work function $W$
depending only on the $q_s$ such that $\partial W/\partial q_s = Q_s$, and if $T$ and $W$ do not involve the time
explicitly, we have the usual first integral 10·02 (6), namely,

$$\dot q_s\,\frac{\partial T}{\partial \dot q_s} - T - W = \text{constant}. \tag{1}$$

This is the energy integral $T - W = \text{constant}$ if $T$ is quadratic in the $\dot q_s$. But a similar
integral can exist even if work has to be done from outside to maintain the constraints,
provided that the other forces have a work function, which we shall still denote by $W$.
In this case some of the coordinates are given functions of the time, and $T$ may not be a
homogeneous quadratic in the unknown $\dot q_s$, since $\dot x_{ri}$ depends partly on the prescribed
velocities. But if $T$ and $W$ do not involve $t$ explicitly the integral (1) will still exist. If

$$T = T_2(\dot q_s) + T_1(\dot q_s) + T_0(q_s), \tag{2}$$

where $T_2$, $T_1$, $T_0$ are homogeneous of degrees 2, 1, 0 in the unconstrained $\dot q_s$, but may involve
$q_s$ but not $t$, we have

$$\dot q_s\,\frac{\partial T}{\partial \dot q_s} = 2T_2 + T_1, \tag{3}$$

and therefore the energy integral is replaced by

$$T_2 - T_0 - W = \text{constant}. \tag{4}$$

This integral is often useful. Consider a circular wire made to rotate with given angular
velocity $\omega$ about a vertical diameter. A bead is free to slide on the wire. Then the kinetic
energy is

$$T = \tfrac{1}{2}ma^2(\dot\theta^2 + \sin^2\theta\,\omega^2) + \tfrac{1}{2}I\omega^2, \tag{5}$$

and the work function is $mga\cos\theta$. If then $\omega$ is constant $T$ does not involve the time
explicitly and we can write down at once the first integral

$$\tfrac{1}{2}ma^2(\dot\theta^2 - \sin^2\theta\,\omega^2) - \tfrac{1}{2}I\omega^2 - mga\cos\theta = \text{constant}. \tag{6}$$

The term $\tfrac{1}{2}I\omega^2$ is itself constant and therefore irrelevant. The function on the left is not
the energy, which is $T - W$ and is not constant, but varies on account of the work that
has to be done by the constraint to keep $\omega$ constant while $\theta$ varies. In fact if $N$ is the couple
needed we have for the rate of performance of work by it

$$N\omega = \frac{d}{dt}(T - W) = \frac{d}{dt}\bigl\{\tfrac{1}{2}ma^2(\dot\theta^2 + \sin^2\theta\,\omega^2) + \tfrac{1}{2}I\omega^2 - mga\cos\theta\bigr\} = \frac{d}{dt}(ma^2\omega^2\sin^2\theta + I\omega^2) \ \text{by (6)} \ = 2ma^2\omega^2\sin\theta\cos\theta\,\dot\theta. \tag{7}$$

Hence

$$N = ma^2\,\frac{d}{dt}(\sin^2\theta\,\omega), \tag{8}$$

which is the couple needed to maintain the angular velocity of the particle, since the
angular momentum of the particle about the vertical is $ma^2\sin^2\theta\,\omega$ and therefore varies
with $\theta$ when $\omega$ is kept constant.
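The distinction between the conserved quantity $T_2 - T_0 - W$ and the energy $T - W$ can be seen numerically. The sketch below integrates the $\theta$ equation that follows from Lagrange's equations with the $T$ and $W$ given above, namely $\ddot\theta = \omega^2\sin\theta\cos\theta - (g/a)\sin\theta$ (so $\theta$ is measured from the lowest point of the wire), by the classical Runge-Kutta method. The constant wire term $\tfrac12 I\omega^2$ is omitted; step size, parameter values and function name are illustrative assumptions.

```python
import math

def simulate_bead(theta0, omega, a=1.0, g=9.81, m=1.0, dt=1e-4, steps=20000):
    """Integrate the bead's motion from rest at theta0 and record two quantities.

    Returns (modified, energy): lists of T2 - T0 - W and of T - W at each step,
    both without the constant term (1/2) I omega^2.
    """
    def accel(th):
        return omega ** 2 * math.sin(th) * math.cos(th) - (g / a) * math.sin(th)

    th, thdot = theta0, 0.0
    modified, energy = [], []
    for _ in range(steps + 1):
        t2 = 0.5 * m * a * a * thdot ** 2
        t0 = 0.5 * m * a * a * math.sin(th) ** 2 * omega ** 2
        w = m * g * a * math.cos(th)
        modified.append(t2 - t0 - w)
        energy.append(t2 + t0 - w)
        # one classical Runge-Kutta step for the pair (theta, theta')
        k1x, k1v = thdot, accel(th)
        k2x, k2v = thdot + 0.5 * dt * k1v, accel(th + 0.5 * dt * k1x)
        k3x, k3v = thdot + 0.5 * dt * k2v, accel(th + 0.5 * dt * k2x)
        k4x, k4v = thdot + dt * k3v, accel(th + dt * k3x)
        th += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        thdot += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return modified, energy

modified, energy = simulate_bead(theta0=1.0, omega=2.0)
print(max(modified) - min(modified))   # stays at rounding level
print(max(energy) - min(energy))       # varies: the constraint does work
```

The difference between the two recorded quantities is $ma^2\omega^2\sin^2\theta$, the term whose rate of change appears in (7).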

10·073. Lagrange's equations for the top. We have

$$2T = A(\dot\theta^2 + \sin^2\theta\,\dot\lambda^2) + C(\dot\chi + \dot\lambda\cos\theta)^2, \qquad W = -Mgh\cos\theta,$$

whence, since $\lambda$ and $\chi$ occur only through their derivatives, we write down at once two
first integrals

$$\frac{\partial T}{\partial \dot\chi} = C(\dot\chi + \dot\lambda\cos\theta) = Cn = \text{constant},$$

$$\frac{\partial T}{\partial \dot\lambda} = A\sin^2\theta\,\dot\lambda + Cn\cos\theta = \text{constant}.$$

The $\theta$ equation is

$$A(\ddot\theta - \sin\theta\cos\theta\,\dot\lambda^2) + Cn\sin\theta\,\dot\lambda = Mgh\sin\theta.$$

It is convenient to use the last equation rather than the energy equation in treating small
oscillations about steady motion because $\dot\theta$ is of the first order in the amplitude.

If the axis is nearly vertical the angles $\lambda$, $\chi$ are measured about nearly the same axis.
Hence in treating small oscillations about the vertical it is convenient to take $\lambda + \chi = \psi$
as a new coordinate instead of $\chi$, so that

$$2T = A(\dot\theta^2 + \sin^2\theta\,\dot\lambda^2) + C\{\dot\psi - \dot\lambda(1 - \cos\theta)\}^2.$$

If $l$, $m$, $\nu$ are direction cosines of the axis with regard to fixed axes, that of $z$ being vertical,

$$\cos\theta = \nu = 1 - \tfrac{1}{2}(l^2 + m^2) + O(\theta^4),$$

$$\dot\lambda(1 - \cos\theta) = 2\dot\lambda\sin^2\tfrac{1}{2}\theta = \tfrac{1}{2}\dot\lambda\sin^2\theta + O(\theta^4) = \tfrac{1}{2}(l\dot m - m\dot l) + O(\theta^4),$$

$$\dot\theta^2 + \sin^2\theta\,\dot\lambda^2 = \dot l^2 + \dot m^2 + \dot\nu^2 = \dot l^2 + \dot m^2 + O(\theta^4).$$

Hence, exactly,

$$\dot\psi - \dot\lambda(1 - \cos\theta) = n,$$

and to order $\theta^2$

$$2T = A(\dot l^2 + \dot m^2) + C\{\dot\psi - \tfrac{1}{2}(l\dot m - m\dot l)\}^2, \qquad 2W = Mgh(l^2 + m^2),$$

whence, taking $l$ and $m$ as Lagrangian coordinates,

$$A\ddot l + Cn\dot m = Mghl, \qquad A\ddot m - Cn\dot l = Mghm,$$

which we have discussed in 4·092.

The device of taking the sum of two rotations about nearly coincident axes as a
coordinate is used in this way in the theory of the motions of the planets; it makes the
maximum use of the simplification introduced by the fact that the mutual inclinations
of the orbits are small.


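10·08. The Hamilton-Jacobi equation. Suppose that a work function $W$ exists
and that the system is holonomic, and put $T + W = L$, the Lagrangian function. We have

$$\delta S = \delta\int_{t_0}^{t_1} L\,dt = \Bigl[\frac{\partial L}{\partial \dot q_s}\,\delta q_s\Bigr]_{t_0}^{t_1} + \int_{t_0}^{t_1} \Bigl(\frac{\partial L}{\partial q_s} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_s}\Bigr)\delta q_s\,dt, \tag{1}$$

if the limits $t_0$, $t_1$ are unaltered. But if $t_0$, $t_1$ are also varied by $\Delta t_0$, $\Delta t_1$ and $\Delta q_s$ is the variation
of $q_s$ to the new limits, the integrated part will become

$$\Bigl[\frac{\partial L}{\partial \dot q_s}\,\Delta q_s - \Bigl(\dot q_s\frac{\partial L}{\partial \dot q_s} - L\Bigr)\Delta t\Bigr]_{t_0}^{t_1}, \tag{2}$$

since $\Delta q_s = \delta q_s + \dot q_s\,\Delta t$. We put

$$\dot q_s\,\frac{\partial L}{\partial \dot q_s} - L = H, \tag{3}$$

$$\frac{\partial L}{\partial \dot q_s} = p_s. \tag{4}$$

Then $H$ is called the Hamiltonian function and $p_s$ a generalized momentum; and

$$\delta S = \Bigl[p_s\,\Delta q_s - H\,\Delta t\Bigr]_{t_0}^{t_1} + \int_{t_0}^{t_1} \Bigl(\frac{\partial L}{\partial q_s} - \frac{d}{dt}\frac{\partial L}{\partial \dot q_s}\Bigr)\delta q_s\,dt. \tag{5}$$

Now if we suppose the $q_s$ given at times $t_0$, $t_1$, the corresponding $p_s$ will in general be
determinate, since only one set of momenta at time $t_0$ will give the same set of
displacements up to time $t_1$. Hence if $S$ is taken along a dynamical path it will be a definite
function of $t_0$, $t_1$, $(q_s)_0$, and $(q_s)_1$. It is called Hamilton's principal function. It is a function of
the $q_s$ at the actual $t_0$, $t_1$ however they are varied, and therefore we can replace $t_1$ by $t$;
then at time $t$, since the last integral in (5) is zero by Lagrange's equations,

$$\frac{\partial S}{\partial t} = -H, \qquad \frac{\partial S}{\partial q_s} = p_s; \tag{6}$$

but at time $t_0$,

$$\frac{\partial S}{\partial t_0} = (H)_{t_0}, \qquad \frac{\partial S}{\partial q_{s0}} = -(p_s)_{t_0}. \tag{7}$$

Now (4) can be used to eliminate $\dot q_s$ from $H$, and then $H$ is a function of $q_s$, $p_s$ and possibly $t$.
Then

$$\frac{\partial S}{\partial t} + H\Bigl(q_s, \frac{\partial S}{\partial q_s}, t\Bigr) = 0. \tag{8}$$

This is the Hamilton-Jacobi equation. It is a partial differential equation of the first
order in the $n + 1$ variables, not involving $S$ explicitly, and its complete integral will
therefore contain $n + 1$ adjustable constants. From our first point of view $S$ was a function
of $t$ only, containing $2n + 1$ adjustable constants, namely, $t_0$ and the initial coordinates
and momenta; but if the initial and final coordinates and $t - t_0$ are given the initial
momenta are determinate and therefore are not adjustable; $n + 1$ is therefore the right
number when $S$ is expressed as a function of $t$ and the $q_s$.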

If $L$ does not contain the time explicitly, $H = \text{constant}$ is the energy integral; if we
denote this constant by $h$ we have

$$\frac{\partial S}{\partial t} = -h, \tag{9}$$

and therefore

$$S = -h(t - t_0) + f(q_s, q_{s0}). \tag{10}$$

Again, $\partial S/\partial q_{s0}$ is simply $-p_{s0}$, which is independent of $t$; and thus we have $n$ equations
expressing that the $\partial S/\partial q_{s0}$, which are functions of the $q_s$, $q_{s0}$, and possibly $t$, are constant
throughout the motion and equal to $-p_{s0}$. Hence, given $S$, we have $n$ equations to
determine the $q_s$ in terms of $t$ and the initial conditions; the whole solution of the problem is
therefore reduced to manipulation if we can determine $S$. This result is due to Hamilton.
The difficulty in using it as it stands is that, while it is often fairly easy to obtain a
complete integral of (8) involving $n + 1$ constants, it is not often easy to express these
constants in terms of $q_{s0}$; they are usually functions of both the $q_{s0}$ and the $p_{s0}$. The theorem
was completed by Jacobi, who showed that any complete integral of (8) can be used in
exactly the same way. Before proving this, however, we need Hamilton's form of the
equations of motion.
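For a concrete principal function, consider a free particle in one dimension with $H = p^2/2m$. This example is supplied here for illustration and is not taken from the text. Along the uniform motion from $q_0$ at $t_0$ to $q$ at $t$, $S = m(q - q_0)^2/\{2(t - t_0)\}$. The sketch below checks by central differences that this $S$ satisfies the Hamilton-Jacobi equation and that $\partial S/\partial q_0 = -p_0$. Function names, the mass and the step `h` are assumptions.

```python
def principal_function(q, t, q0=0.0, t0=0.0, m=2.0):
    """S = m (q - q0)^2 / (2 (t - t0)) for a free particle."""
    return m * (q - q0) ** 2 / (2.0 * (t - t0))

def hj_residual(q, t, m=2.0, h=1e-5):
    """dS/dt + (dS/dq)^2 / (2m), which should vanish if S solves the equation."""
    dS_dt = (principal_function(q, t + h, m=m) - principal_function(q, t - h, m=m)) / (2.0 * h)
    dS_dq = (principal_function(q + h, t, m=m) - principal_function(q - h, t, m=m)) / (2.0 * h)
    return dS_dt + dS_dq ** 2 / (2.0 * m)

def dS_dq0(q, t, m=2.0, h=1e-5):
    """Derivative of S with respect to the initial coordinate q0, taken at q0 = 0."""
    return (principal_function(q, t, q0=h, m=m) - principal_function(q, t, q0=-h, m=m)) / (2.0 * h)

q, t, m = 1.5, 0.7, 2.0
p0 = m * q / t                      # constant momentum of the uniform motion
print(hj_residual(q, t, m))         # close to zero
print(dS_dq0(q, t, m), -p0)         # the two values should agree
```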


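10·09. Hamilton's equations. $L$ is a function of $q_s$, $\dot q_s$, and possibly $t$; $H$ is a function
of $q_s$, $p_s$, and possibly $t$. Then for arbitrary variations of $q_s$ and $\dot q_s$, without varying $t$,

$$\delta H = \delta(\dot q_s p_s - L) = \dot q_s\,\delta p_s + p_s\,\delta\dot q_s - \frac{\partial L}{\partial q_s}\,\delta q_s - \frac{\partial L}{\partial \dot q_s}\,\delta\dot q_s. \tag{11}$$

But by definition $p_s = \partial L/\partial \dot q_s$; and therefore

$$\delta H = \dot q_s\,\delta p_s - \frac{\partial L}{\partial q_s}\,\delta q_s, \tag{12}$$

$$\dot q_s = \frac{\partial H}{\partial p_s}, \qquad \frac{\partial L}{\partial q_s} = -\frac{\partial H}{\partial q_s}. \tag{13}$$

But by Lagrange's equations

$$\dot p_s = \frac{d}{dt}\frac{\partial L}{\partial \dot q_s} = \frac{\partial L}{\partial q_s},$$

and therefore

$$\dot q_s = \frac{\partial H}{\partial p_s}, \qquad \dot p_s = -\frac{\partial H}{\partial q_s}. \tag{14}$$

These are Hamilton's equations. They are to be regarded as a set of $2n$ differential equations
of the first order, Lagrange's being $n$ of the second order.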
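As an illustration of treating the motion as $2n$ first-order equations, the sketch below integrates $\dot q = \partial H/\partial p$, $\dot p = -\partial H/\partial q$ for a one-dimensional harmonic oscillator, $H = p^2/2m + \tfrac12 kq^2$. The Hamiltonian, parameter values and the choice of the classical Runge-Kutta scheme are assumptions made for the example, not part of the text.

```python
import math

def hamiltonian(q, p, m=1.0, k=4.0):
    """H = p^2 / (2m) + k q^2 / 2."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def integrate(q, p, t_end, m=1.0, k=4.0, n=20000):
    """Runge-Kutta integration of dq/dt = dH/dp = p/m and dp/dt = -dH/dq = -k q."""
    dt = t_end / n

    def rhs(qv, pv):
        return pv / m, -k * qv

    for _ in range(n):
        a1, b1 = rhs(q, p)
        a2, b2 = rhs(q + 0.5 * dt * a1, p + 0.5 * dt * b1)
        a3, b3 = rhs(q + 0.5 * dt * a2, p + 0.5 * dt * b2)
        a4, b4 = rhs(q + dt * a3, p + dt * b3)
        q += dt * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
        p += dt * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
    return q, p

q1, p1 = integrate(1.0, 0.0, 1.0)
# Exact solution for these values: q = cos(2 t), p = -2 sin(2 t).
print(q1, math.cos(2.0), p1, -2.0 * math.sin(2.0))
print(hamiltonian(q1, p1), hamiltonian(1.0, 0.0))   # H is conserved
```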

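Hamilton's equations can be directly related to a variational principle as follows. Take*

$$B = \int_{t_0}^{t_1} \{p_s\dot q_s - H(q_s, p_s)\}\,dt. \tag{15}$$

Then

$$\delta B = \int_{t_0}^{t_1} \Bigl(p_s\,\delta\dot q_s + \dot q_s\,\delta p_s - \frac{\partial H}{\partial q_s}\,\delta q_s - \frac{\partial H}{\partial p_s}\,\delta p_s\Bigr)dt = \Bigl[p_s\,\delta q_s\Bigr]_{t_0}^{t_1} + \int_{t_0}^{t_1} \Bigl\{\Bigl(\dot q_s - \frac{\partial H}{\partial p_s}\Bigr)\delta p_s - \Bigl(\dot p_s + \frac{\partial H}{\partial q_s}\Bigr)\delta q_s\Bigr\}dt. \tag{16}$$

The conditions that $B$ shall be stationary when $\delta q_s = 0$ at $t_0$ and $t_1$, for all variations
$\delta q_s$, $\delta p_s$ at intermediate times, $q_s$ and $p_s$ being supposed to vary independently, are

$$\dot p_s = -\frac{\partial H}{\partial q_s}, \qquad \dot q_s = \frac{\partial H}{\partial p_s}, \tag{17}$$

which are Hamilton's equations. If they are satisfied and the limits also are varied,

$$\delta B = \Bigl[(p_s\dot q_s - H)\,\Delta t + p_s\,\delta q_s\Bigr]_{t_0}^{t_1} = \Bigl[p_s\,\Delta q_s - H\,\Delta t\Bigr]_{t_0}^{t_1}, \tag{18}$$

which is precisely the same as $\delta S$.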

The last argument does not prove Hamilton's equations. For if we were to define $p_s$ as
$\partial L/\partial \dot q_s$ in the usual way there are relations between $p_s$ and $q_s$, and the variations $\delta q_s$, $\delta p_s$
are not independent. Hence we cannot equate their coefficients to zero. On the other hand,
if we do not use a preliminary definition of $p_s$ there is no particular reason why the integral
should be stationary for variations of $p_s$ irrespective of $q_s$. But we have seen that, given
$p_s = \partial L/\partial \dot q_s$, $\dot q_s = \partial H/\partial p_s$ is merely a matter of differential calculus. The dynamics is
contained in the other set of equations. Then in (16) the coefficients of the $\delta p_s$ do vanish
identically, and therefore those of $\delta q_s$ can be equated to zero, leading to the other set of
Hamilton's equations. The remarkable point is that, though in fact the $p_s$ are originally
defined in terms of the velocities, nevertheless if we choose to regard them as subject
to independent variations, $B$ is stationary subject to Hamilton's $2n$ equations. The
variations are not independent, but $B$ is stationary whether they are or not.


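* G. H. Livens, Proc. R. S. Edinb. 39, 1919, 113-19.

10·10. Jacobi's theorem. Let

$$S = f(q_1, \ldots, q_n;\ t;\ \alpha_1, \ldots, \alpha_n) + \alpha_{n+1} \tag{19}$$

be a complete integral of the Hamilton-Jacobi equation. In the original form the
constants $\alpha_1 \ldots \alpha_n$ are the coordinates at time $t_0$; but Jacobi gives up this restriction and in
place of $\partial S/\partial q_{s0} = -p_{s0}$ takes

$$\frac{\partial S}{\partial \alpha_r} = -\beta_r, \tag{20}$$

$$\frac{\partial S}{\partial q_s} = p_s, \tag{21}$$

where $\beta_r$ is another constant. The theorem is that if we still take these as equations to
determine $q_s$ and $p_s$, the resulting $q_s$, $p_s$ will satisfy Hamilton's equations and therefore
give a dynamical motion of the system. For, from (20),

$$0 = \frac{d}{dt}\frac{\partial S}{\partial \alpha_r} = \Bigl(\frac{\partial}{\partial t} + \dot q_s\frac{\partial}{\partial q_s}\Bigr)\frac{\partial S}{\partial \alpha_r} = \frac{\partial^2 S}{\partial \alpha_r\,\partial t} + \dot q_s\,\frac{\partial^2 S}{\partial \alpha_r\,\partial q_s} = -\frac{\partial H}{\partial \alpha_r} + \dot q_s\,\frac{\partial p_s}{\partial \alpha_r}. \tag{22}$$

But $\alpha_r$ enters into $H$ only through the fact that the $p_s$ determined by (21) will contain $\alpha_r$.
Hence

$$-\frac{\partial H}{\partial p_s}\frac{\partial p_s}{\partial \alpha_r} + \dot q_s\,\frac{\partial p_s}{\partial \alpha_r} = \Bigl(\dot q_s - \frac{\partial H}{\partial p_s}\Bigr)\frac{\partial p_s}{\partial \alpha_r} = 0. \tag{23}$$

This is true for $r = 1, 2, \ldots, n$; hence either

$$\dot q_s = \partial H/\partial p_s \quad (s = 1, 2, \ldots, n) \tag{24}$$

or

$$\frac{\partial(p_1 \ldots p_n)}{\partial(\alpha_1 \ldots \alpha_n)} = 0. \tag{25}$$

In the latter case there would be a functional relation between the $p_s$ and the initial
momenta could not be varied independently; hence (25) does not hold and therefore
(24) do.

Again, from (21),

$$\frac{dp_s}{dt} = \Bigl(\frac{\partial}{\partial t} + \dot q_m\frac{\partial}{\partial q_m}\Bigr)\frac{\partial S}{\partial q_s} = -\Bigl(\frac{\partial H}{\partial q_s}\Bigr)_{\alpha} + \Bigl(\frac{\partial H}{\partial p_m}\Bigr)_{q}\Bigl(\frac{\partial p_m}{\partial q_s}\Bigr)_{\alpha}. \tag{26}$$

But

$$\Bigl(\frac{\partial H}{\partial q_s}\Bigr)_{\alpha} = \Bigl(\frac{\partial H}{\partial q_s}\Bigr)_{p} + \Bigl(\frac{\partial H}{\partial p_m}\Bigr)_{q}\Bigl(\frac{\partial p_m}{\partial q_s}\Bigr)_{\alpha}, \tag{27}$$

and therefore

$$\frac{dp_s}{dt} = -\Bigl(\frac{\partial H}{\partial q_s}\Bigr)_{p}. \tag{28}$$

Hence $q_s$ and $p_s$ found from (20) and (21) satisfy Hamilton's equations.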
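A minimal worked instance of the theorem, supplied here for illustration and not taken from the text, is the free particle with one coordinate and $H = p^2/2m$. The complete integral below uses a constant $\alpha$ that is a momentum and not an initial coordinate, which is exactly the freedom Jacobi allows.

```latex
% Assumed example: free particle, H = p^2 / (2m), n = 1.
\begin{align*}
S &= \alpha q - \frac{\alpha^2}{2m}\,t + \alpha_2,
  &\frac{\partial S}{\partial t} + \frac{1}{2m}\Bigl(\frac{\partial S}{\partial q}\Bigr)^2
  &= -\frac{\alpha^2}{2m} + \frac{\alpha^2}{2m} = 0, \\
\frac{\partial S}{\partial \alpha} &= q - \frac{\alpha}{m}\,t = -\beta
  &&\Longrightarrow\quad q = \frac{\alpha}{m}\,t - \beta, \\
\frac{\partial S}{\partial q} &= p = \alpha
  &&\Longrightarrow\quad \dot q = \frac{\alpha}{m} = \frac{\partial H}{\partial p},
  \qquad \dot p = 0 = -\frac{\partial H}{\partial q}.
\end{align*}
```

The first line verifies (8), the second is (20), and the third is (21). The resulting $q$, $p$ describe uniform motion, with $-\beta$ the coordinate at $t = 0$.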


10*11. Transformation theory. Any transformation of the q 8 to a new set of n 
coordinates q' r with no functional relation connecting them will give a new way of stating 
the dynamical problem; Lagrange’s equations will hold for the q' r , and can be transformed 
to Hamilton’s form in exactly the same way. Such a transformation is called a point- 
transformation. There is, however, a more general type called a contact-transformation 
such that the q' r are defined as functions of both q 8 and p 8 , and nevertheless we can still 
define a set of p' r so that q' T and p' T satisfy equations of Hamilton’s form. Hamilton’s 
equations in q 8 , p 8 are equivalent to 

SB = ~ H ( q *’ Pa ' ^ dt = * < 29 ) 

We require also that there shall be a function H' such that 

$w = p'r, t)\dt = 


(30) 
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Take a function J(q a ,p' r , t) and suppose 

p 8 — dJ jdq 8 , q' r = dJ jdp' r , H' = H + dJjd1. 


(31) 


For small variations of q 8 , p 8 with t constant these equations can be solved to give small 
variations of q' r> p' r , and conversely, provided 

|| d>Jldq s dp' r \\*0. (32) 

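Then

$$\delta(B' - B) = \delta\int (p'_r \dot q'_r - p_s \dot q_s - H' + H)\,dt = \delta\int \left(p'_r \dot q'_r + q'_r \dot p'_r - \frac{dJ}{dt}\right) dt = \delta[\,p'_r q'_r - J\,] \tag{33}$$

by (31); then

$$\delta(B' - B) = \left[\,p'_r\,\delta q'_r + q'_r\,\delta p'_r - \frac{\partial J}{\partial q_s}\,\delta q_s - \frac{\partial J}{\partial p'_r}\,\delta p'_r\right] = [\,p'_r\,\delta q'_r - p_s\,\delta q_s\,] \tag{34}$$

by using (31) again. Hence (29) and (30) are equivalent. But Hamilton's equations hold
for $q_s$, $p_s$, $H$ if and only if (29) is true to orders $\delta q_s$, $\delta p_s$ for all small variations of the path.
Hence (30) is true to orders $\delta q'_r$, $\delta p'_r$, and therefore $q'_r$, $p'_r$, $H'$ also satisfy Hamilton's equations.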

Note that if $J = q_s p'_s$, we have $q'_s = q_s$, $p'_s = p_s$; and that if $q'_r = p_r$, $p'_r = q_r$, then $q'_r$ and $p'_r$
satisfy Hamilton's equations with $H' = -H$.
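The last remark can be checked numerically on a simple case. The sketch below assumes, purely for illustration, the unit harmonic oscillator $H = \tfrac12(p^2 + q^2)$ with the solution $q = \cos t$, $p = -\sin t$ (this example is not taken from the text), exchanges the variables, and verifies by central differences that the exchanged pair satisfies Hamilton's equations with $H' = -H$.

```python
import math

# Assumed illustration: H = (p**2 + q**2) / 2, solved by q = cos t, p = -sin t.
def q(t):
    return math.cos(t)

def p(t):
    return -math.sin(t)

# Exchange of variables: q' = p, p' = q, with H' = -H = -(p'**2 + q'**2) / 2.
def q_new(t):
    return p(t)

def p_new(t):
    return q(t)

def dH_new_dq(qn, pn):
    return -qn          # partial derivative of H' with respect to q'

def dH_new_dp(qn, pn):
    return -pn          # partial derivative of H' with respect to p'

def ddt(f, t, h=1e-6):
    # central-difference estimate of df/dt
    return (f(t + h) - f(t - h)) / (2 * h)

def residuals(t):
    # Hamilton's equations for the primed variables, written as residuals:
    # dq'/dt - dH'/dp' and dp'/dt + dH'/dq' should both vanish.
    r1 = ddt(q_new, t) - dH_new_dp(q_new(t), p_new(t))
    r2 = ddt(p_new, t) + dH_new_dq(q_new(t), p_new(t))
    return r1, r2

worst = max(abs(r) for t in (0.0, 0.7, 2.0, 5.0) for r in residuals(t))
print(worst)  # limited only by the finite-difference error
```

The residuals are zero analytically; the printed value reflects only the error of the difference quotient.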

In particular let $H = K + mK_1$, where $K$ is a Hamiltonian such that the solution of the
Hamilton-Jacobi equation is known, say $S(q_s, \alpha_r, t)$, and $m$ is small. Define

$$p'_r = \alpha_r, \quad q'_r = \beta_r = \partial S/\partial \alpha_r, \quad J = S. \tag{35}$$

$\beta_r$ differs in sign from that defined by (20). If $K_1$ is now expressed in terms of $\alpha_r$, $\beta_r$, $t$ we
shall have

$$\dot\alpha_r = -m\,\partial K_1/\partial \beta_r, \quad \dot\beta_r = m\,\partial K_1/\partial \alpha_r. \tag{36}$$

Let the integral of $K_1$ with regard to $t$ as if $\alpha_r$, $\beta_r$ were constant be $-S_1(\alpha_r, \beta_r, t)$, and now
take

$$J_1 = \alpha'_r \beta_r + mS_1(\alpha'_r, \beta_r, t). \tag{37}$$

The determinant in (32) is $1 + O(m)$. Then

$$\beta'_r = \beta_r + m\,\partial S_1/\partial \alpha'_r, \quad \alpha_r = \alpha'_r + m\,\partial S_1/\partial \beta_r, \tag{38}$$

$$H'' = mK_1(\alpha_r, \beta_r, t) - mK_1(\alpha'_r, \beta_r, t) = O(m^2). \tag{39}$$

Thus the method is suited to rapid approximation. 

In celestial mechanics six parameters are needed to specify the coordinates of a planet
as functions of the time; the $\alpha_r$, $\beta_r$ can be used for this purpose, and are constant when
there is no disturbance. When there are perturbations by other planets we can use (35)
to specify $\alpha_r$, $\beta_r$ in terms of the coordinates and velocities, and conversely; and then the
method of transformation expresses the $\alpha_r$, $\beta_r$ in terms of quantities as nearly constant
as we like.*

* The type of transformation (31) seems to have been introduced by W. F. Donkin, Phil. Trans.
1855, pp. 313-22. He seems to have also been the first to see that the transformations may contain
$t$ explicitly. The form (35), with extensions, is used by E. W. Brown and C. A. Shook, Planetary Theory, 1933.

Hamilton's equations are not much used in ordinary dynamics, because the first step
in solving them would usually be to eliminate half the variables by differentiation and
obtain equations of the second order. Their usefulness is in difficult problems. The Hamiltonian
equations are also fundamental in statistical mechanics, and the function $H$ itself
plays an important but still imperfectly understood part in quantum mechanics.


10·12. The principle of least action. We have from Hamilton's equations, if $H$
does not involve $t$ explicitly,

$$\frac{dH}{dt} = \dot q_s \frac{\partial H}{\partial q_s} + \dot p_s \frac{\partial H}{\partial p_s} = \frac{\partial H}{\partial p_s}\frac{\partial H}{\partial q_s} - \frac{\partial H}{\partial q_s}\frac{\partial H}{\partial p_s} = 0.$$

Hence in any conservative holonomic system the Hamiltonian function does not vary
with the time in any dynamical motion of the system. This follows also from 10·072 (1),
since by definition $H$ is the function there shown to be constant. Now we have $S = \int_{t_0}^{t_1} L\,dt$,
and if the times at the limits are varied


Now take 
then 


“■[■“W'M** 

$$A = S + H(t_1 - t_0);$$

“-[“*S**KGHS** 


(41) 

(42) 

(43) 


Now in deriving Lagrange's equations from Hamilton's principle we took fixed limits
$t_0$, $t_1$, but allowed the $q_s$ to vary quite arbitrarily. Thus the variations admitted permitted
$H$ to vary. But if we restrict ourselves to varied paths such that $H$ is constant and equal
to its value in the actual path, $\Delta H = 0$, and if also $\Delta q_s = 0$ and Lagrange's equations are
satisfied, $\Delta A = 0$. Since $L = T + W$, $H = T - W$,

$$A = \int_{t_0}^{t_1} (L + H)\,dt = \int_{t_0}^{t_1} 2T\,dt.$$

The function $A$ is called the action, and the rule just given, that in the conditions specified
$\Delta A = 0$, is the principle of least action.* $A$ is also called the characteristic function and can
be made the basis of the transformation theory instead of the principal function. The
principle of least action is equivalent to Hamilton's principle, but is less convenient to use.
When the principle of least action is spoken of, Hamilton's principle is usually intended.


10·13. Routh's modified Lagrangian function. In many dynamical problems
some of the coordinates do not occur explicitly in $L$, only their rates of change occurring.
Such coordinates are called ignorable, the others palpable. By a simple transformation it
is possible to eliminate any or all of the former from the equations of motion. Let us
keep the notation $q_s$ for the coordinates that we propose to keep, but $\phi_\sigma$ for the
ignorable ones that we propose to eliminate. Let

$$\partial L/\partial \dot\phi_\sigma = \eta_\sigma. \tag{1}$$

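* It can be shown that $A$ is a minimum for a dynamical path if $t_1 - t_0$ is not too large; if $t_1 - t_0$
is large $A$ may be stationary for small variations but neither a true minimum nor a maximum.

Then by Lagrange's equations $\eta_\sigma$ is constant throughout the motion. Now form the
function

$$R = L - \eta_\sigma \dot\phi_\sigma \tag{2}$$

and eliminate $\dot\phi_\sigma$ in favour of $\eta_\sigma$, not varying $\eta_\sigma$. We have

$$\delta R = \frac{\partial L}{\partial q_s}\,\delta q_s + \frac{\partial L}{\partial \dot q_s}\,\delta \dot q_s + \frac{\partial L}{\partial \dot\phi_\sigma}\,\delta\dot\phi_\sigma - \eta_\sigma\,\delta\dot\phi_\sigma, \tag{3}$$

and the last terms cancel. Hence when $R$ is expressed in terms of $q_s$, $\dot q_s$ and $\eta_\sigma$, and $L$ in
terms of $q_s$, $\dot q_s$ and $\dot\phi_\sigma$,

$$\left(\frac{\partial R}{\partial q_s}\right)_\eta = \left(\frac{\partial L}{\partial q_s}\right)_{\dot\phi}, \quad \left(\frac{\partial R}{\partial \dot q_s}\right)_\eta = \left(\frac{\partial L}{\partial \dot q_s}\right)_{\dot\phi}. \tag{4}$$

Therefore, by Lagrange's equations,

$$\frac{d}{dt}\frac{\partial R}{\partial \dot q_s} - \frac{\partial R}{\partial q_s} = 0. \tag{5}$$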


Routh's transformation, apart from a sign, is similar to Hamilton's, but is applied only
to the ignorable coordinates (and not necessarily to all of them) instead of to all. Its
applications are totally different, being mostly to small oscillations about steady motion.
A steady motion may be defined as one such that the palpable coordinates are constant.
It follows that in a steady motion $\partial R/\partial q_s = 0$, and we can expand $R$ to the second order
in departures of the $q_s$ from their values in the steady motion and form linear differential
equations for them exactly as for small oscillations about equilibrium. There is, however,
one important difference. The elimination of $\dot\phi_\sigma$ in favour of $\eta_\sigma$ usually brings in terms
of the form $f(q_r)\,\dot q_s$, and when we approximate there will be terms in $R$ of the form $g_{rs} q_r \dot q_s$.
Now for particular values of $r$, $s$,

$$\frac{d}{dt}\frac{\partial}{\partial \dot q_s}(g_{rs} q_r \dot q_s) - \frac{\partial}{\partial q_s}(g_{rs} q_r \dot q_s) = g_{rs}\,\dot q_r, \tag{6}$$

$$\frac{d}{dt}\frac{\partial}{\partial \dot q_r}(g_{rs} q_r \dot q_s) - \frac{\partial}{\partial q_r}(g_{rs} q_r \dot q_s) = -g_{rs}\,\dot q_s, \tag{7}$$

and these terms introduce terms in the velocities into the equations of motion. These
are called gyroscopic terms.


10·14. Variations of multiple integrals. The fundamental equations of many
subjects are equivalent to statements that an integral is stationary for small variations 
of some function in it. The equations of static elasticity, for instance, can be expressed 
by a principle of stationary energy, the energy being a volume integral of a quadratic 
function of the strain components. In some problems the use of this principle is the 
nearest approach to a reliable way of getting the signs right. 

We take as an illustration the derivation of Schrödinger's wave equation for a single
particle from a variation principle. The Hamiltonian function is (apart from certain
constant factors)

$$H(p_i, x_i) = \tfrac12 p_i^2 + V, \tag{1}$$


0 

and we replace p t by V is the potential energy, supposed to be a given function 

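of $x_i$. Consider the integrals

$$I = \iiint \psi\left(-\tfrac12\nabla^2\psi + V\psi\right)d\tau, \quad J = \iiint \psi^2\,d\tau \tag{2}$$

through all space, subject to the conditions that $\psi$ tends to zero at infinity at least as
rapidly as $1/r$, and that $J$ is given. Then the condition for $I$ to be stationary for small
variations of $\psi$ is

$$0 = \delta\iiint \{\psi(-\tfrac12\nabla^2\psi + V\psi) - \lambda\psi^2\}\,d\tau \tag{3}$$

$$= \iiint \{-\tfrac12\,\delta(\psi\nabla^2\psi) + 2V\psi\,\delta\psi - 2\lambda\psi\,\delta\psi\}\,d\tau. \tag{4}$$

But

$$\iiint \delta(\psi\nabla^2\psi)\,d\tau = \iiint (\delta\psi\,\nabla^2\psi + \psi\,\nabla^2\delta\psi)\,d\tau \tag{5}$$

and

$$\iiint (\psi\,\nabla^2\delta\psi - \delta\psi\,\nabla^2\psi)\,d\tau = \iint \left(\psi\,\frac{\partial\,\delta\psi}{\partial n} - \delta\psi\,\frac{\partial\psi}{\partial n}\right)dS \tag{6}$$

by Green's theorem; and the surface integral tends to 0 in the conditions stated. Hence

$$0 = 2\iiint \delta\psi\,(-\tfrac12\nabla^2\psi + V\psi - \lambda\psi)\,d\tau, \tag{7}$$

and if this is true for all small variations $\delta\psi$

$$\nabla^2\psi = 2(V - \lambda)\psi \tag{8}$$

everywhere.

Further, we can take $\delta\psi$ proportional to $\psi$, and when $\psi$ satisfies this differential
equation

$$\lambda\iiint \psi^2\,d\tau = \iiint (-\tfrac12\nabla^2\psi + V\psi)\,\psi\,d\tau, \tag{9}$$

which determines $\lambda$. It is a common practice to choose the constant factor in $\psi$ so that
the integral $J$ is equal to 1.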
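As a rough numerical illustration of (9), we may take a one-dimensional analogue with $V = \tfrac12 x^2$ and the trial function $\psi = \exp(-\tfrac12 x^2)$. Both choices are assumptions made here for illustration and are not taken from the text. This $\psi$ satisfies the one-dimensional form of (8) with $\lambda = \tfrac12$, so the ratio of the two integrals, evaluated by finite differences and the trapezoidal rule, should come out close to that value.

```python
import math

# Assumed one-dimensional example: V = x**2 / 2, psi = exp(-x**2 / 2).
def psi(x):
    return math.exp(-0.5 * x * x)

def V(x):
    return 0.5 * x * x

h = 0.01
xs = [-8.0 + h * k for k in range(1601)]   # grid on [-8, 8]; psi is negligible at the ends

def second_derivative(f, x):
    # central second difference
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def trapezoid(values):
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

# I = integral of psi * (-psi''/2 + V psi),  J = integral of psi**2
I = trapezoid([psi(x) * (-0.5 * second_derivative(psi, x) + V(x) * psi(x)) for x in xs])
J = trapezoid([psi(x) ** 2 for x in xs])
lam = I / J
print(lam)  # close to 0.5 for this trial function
```

Dividing by $J$ plays the part of the normalization mentioned above: the ratio is unchanged if $\psi$ is multiplied by a constant factor.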


EXAMPLES 

1. If $I = \int (1 + y'^2)^{1/2}\,dx$ between specified limits for $x$ is stationary subject to $J = \int y\,dx$ having
a given value, prove that the graph of $y$ against $x$ is an arc of a circle.

If the terminal conditions are $x = \pm a$, $y = 0$, what happens if the given value of $J$ is greater than
$\tfrac12\pi a^2$, and $I$ is interpreted as (1) an improper Riemann integral, (2) an integral $\int_{x=-a}^{a} (dx^2 + dy^2)^{1/2}$ with
an appropriate generalization of the definition of a Stieltjes integral?

2. Find the curves in the $(x, y)$ plane such that $\int \sqrt{(2E - n^2y^2)}\,ds$ is stationary, where $E$ and $n$ are
constants and the integral is taken between fixed end-points.

Verify that these curves are the tracks of a particle of unit mass moving with energy $E$ under the
force $(0, -n^2y)$, taking the potential energy to be zero on the line $y = 0$. (M.T. 1944.)


3. If the velocity of waves in a sphere is $c = \alpha - \beta r^2$, where $\alpha$ and $\beta$ are constants, prove that
the paths of stationary time are circles; and if a path enters the sphere at an angle $e$ to the surface,
find the polar coordinates of the deepest point of the path and the time taken to reach it.

(Wiechert.) 

4. If $\int ds$ is stationary for variations of a path with fixed termini, where

$$ds^2 = g_{ik}\,dx_i\,dx_k \quad (i, k = 1, 2, 3, 4),$$

prove that

$$\frac{d}{ds}\left(g_{ik}\frac{dx_k}{ds}\right) = \frac12\frac{\partial g_{km}}{\partial x_i}\frac{dx_k}{ds}\frac{dx_m}{ds},$$

three of these equations being independent.


(Riemann.) 


5. If in Example 4, 

ds 2 = c*(l- 2 ^) dt 2 - 1 dr 2 — r 2 (d0 2 + sin 8 ddA 2 ), 

find three first integrals of the equations of motion; and if a particle moves nearly in a circle in
the plane $\theta = \tfrac12\pi$, find the apsidal angle. (Einstein.)



Chapter 11 

FUNCTIONS OF A COMPLEX VARIABLE 

Of fowls after their kind, and of cattle after their kind, of every creeping thing of 
the earth after his kind, two of every sort shall come unto thee, to keep them alive. 

Genesis vi, 20 

11·01. Meaning and algebra of complex numbers. There are three chief reasons
why complex functions, involving a symbol $i$ such that $i \cdot i = -1$, are of importance
in physics, which involves only real quantities directly. The first is that many physical
quantities are functions $\phi$, $\psi$ of two variables $x$ and $y$, where $\phi$ and $\psi$ are connected by
the relations

$$\frac{\partial\phi}{\partial x} = \frac{\partial\psi}{\partial y}, \quad \frac{\partial\phi}{\partial y} = -\frac{\partial\psi}{\partial x}.$$

Such pairs of functions occur, for instance, in two-dimensional problems of electrostatics,
where $\phi$ is the potential and $\psi$ the charge function; in two-dimensional hydrodynamics
of an incompressible fluid, where $\phi$ is the velocity potential and $\psi$ the stream function;
and in the closely analogous problem of flow of electric current in a uniform sheet. Then
$\phi$ and $\psi$ are the real and imaginary parts of what is called an analytic function $w$ of the
complex variable $z = x + iy$. The second is that the solutions of the differential equations
of physics, for certain ranges of a real variable, are usually obtained as power series; but
the same power series will equally well specify the values of a function of a complex
variable, and the study of the complex values is often a great help towards obtaining
more compact expressions for the real ones and relating expressions by power series valid
in different ranges. The third is that many integrals given in real form are most easily
evaluated by relating them to complex integrals and using the powerful method of
contour integration based on Cauchy's theorem.

The important property of complex numbers is that they can be defined in such a way
that they satisfy the fundamental rules of algebra 1·01 (1) to (9). We first consider the
consequences of applying these rules to the real numbers together with a symbol $i$ with
the property $i^2 = -1$. Since there is no real number with this property it is customary to
speak of $i$ as imaginary. If $a$ and $b$ are real numbers, $c = a + ib$ is called a complex number,
$a$ its real part, and $b$ (not $ib$) its imaginary part. We also use the notations $\Re(c) = a$,
$\Im(c) = b$ to denote the real and imaginary parts of $c$.

First, if $i^2 = -1$ and $a$, $b$ are real, and $a = ib$, it follows that if the rules of algebra are
obeyed by $i$,

$$a^2 = (ib)^2 = ibib = iibb = -b^2,$$

and therefore $a = b = 0$. If a real quantity is equal to an imaginary one, both are zero.

Next, if $c = a + ib$, $c' = a' + ib'$, where $a$, $a'$, $b$, $b'$ are real, the rules of algebra give

$$-c = -a - ib, \tag{1}$$

$$c + c' = a + a' + i(b + b'), \tag{2}$$

$$c - c' = c + (-c') = a - a' + i(b - b'), \tag{3}$$

$$cc' = aa' - bb' + i(ab' + a'b), \tag{4}$$

$$ic = -b + ia. \tag{5}$$





By (3) if $c - c' = 0$, $a = a'$ and $b = b'$. If two complex numbers are equal, their real parts
are equal and their imaginary parts are equal. Hence by (2) and (4) the real and imaginary
parts of the sum and product of two complex numbers are uniquely determined in terms
of those of the two original numbers.

We can, however, formulate these rules without the use of the symbol $i$, as an algebra
of pairs of real numbers.* We think now of a pair $(a, b)$ as corresponding to $a + ib$, and to
show the comparison of the notations we put

$$\gamma = (a, b); \quad c = a + ib.$$

Then a real number $a$ corresponds to the pair $(a, 0)$, and an imaginary number $ib$ to the
pair $(0, b)$, and in particular $i$ to $(0, 1)$. If we now define $-\gamma$, $\gamma \pm \gamma'$, $\gamma\gamma'$ by the rules


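$$-\gamma = (-a, -b), \tag{1'}$$

$$\gamma + \gamma' = (a + a', b + b'), \tag{2'}$$

$$\gamma - \gamma' = (a - a', b - b'), \tag{3'}$$

$$\gamma\gamma' = (aa' - bb', ab' + a'b), \tag{4'}$$

we find

$$i^2 = (0, 1)(0, 1) = (-1, 0), \tag{6}$$

$$i\gamma = (0, 1)(a, b) = (-b, a), \tag{7}$$

$$i^2\gamma = (-a, -b) = -\gamma; \quad i\gamma i = -\gamma. \tag{8}$$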


Thus the definitions of the components of $-\gamma$, $\gamma \pm \gamma'$, $\gamma\gamma'$ and $i\gamma$ are identical with the
rules for the real and imaginary parts of $-c$, $c \pm c'$, $cc'$, $ic$, and $i^2$ corresponds to $-1$. We
can henceforth use $c$ instead of $\gamma$, leaving it to be understood from the context whether
we are speaking of the complex number or the pair of real numbers.

These rules are consistent with the ordinary rules of algebra for addition and multiplication
of real quantities, which have been stated in Chapter 1. We have the commutative
law of addition

$$c + c' = c' + c,$$

the associative law of addition

$$c + (c' + c'') = (c + c') + c'',$$

the commutative law of multiplication

$$cc' = c'c,$$

the associative law of multiplication

$$c(c'c'') = (cc')c'',$$

and the distributive law

$$c(c' + c'') = cc' + cc''.$$

* A similar idea of number pairs occurs in the theory of rational fractions. What we write as $c = a/b$
can be written as a number pair $c = (a, b)$, the rule for addition being taken to be

$$(a, b) + (a', b') = (ab' + ba', bb')$$

and the rule for multiplication $(a, b)(a', b') = (aa', bb')$.

$1/(a, b)$ is defined as $(b, a)$, and $(a, a) = (1, 1)$. It may be verified that with these definitions the
laws still hold.

The truth of the first three and the fifth of these, when addition and multiplication are
carried out according to the rules, is obvious. For the fourth we have

$$c(c'c'') = (a, b)(a'a'' - b'b'', a'b'' + a''b')$$

$$= (aa'a'' - ab'b'' - a'bb'' - a''bb', \; ba'a'' - bb'b'' + aa'b'' + aa''b'),$$

$$(cc')c'' = (aa' - bb', ab' + a'b)(a'', b'')$$

$$= (aa'a'' - a''bb' - ab'b'' - a'bb'', \; aa''b' + a'a''b + aa'b'' - bb'b''),$$

and the explicit interpretations are identical. Hence complex numbers can be handled
by algebraic methods just like real numbers.
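The pair rules for addition and multiplication are easily tried out mechanically. The following minimal sketch uses tuples of integers for the pairs; the function names `add` and `mul` and the sample values are assumptions made for illustration. It checks that $i^2$ corresponds to $-1$, that multiplication is associative and distributive for one set of values, and that the result agrees with the built-in complex arithmetic.

```python
def add(c, d):
    # (a, b) + (a', b') = (a + a', b + b')
    return (c[0] + d[0], c[1] + d[1])

def mul(c, d):
    # (a, b)(a', b') = (aa' - bb', ab' + a'b)
    a, b = c
    a2, b2 = d
    return (a * a2 - b * b2, a * b2 + a2 * b)

i = (0, 1)
c, c1, c2 = (2, 3), (-1, 4), (5, -2)   # arbitrary sample pairs

i_squared = mul(i, i)
assoc_left = mul(c, mul(c1, c2))       # c(c'c'')
assoc_right = mul(mul(c, c1), c2)      # (cc')c''
distributive_ok = mul(c, add(c1, c2)) == add(mul(c, c1), mul(c, c2))
matches_builtin = complex(*mul(c, c1)) == complex(*c) * complex(*c1)

print(i_squared, assoc_left, assoc_right, distributive_ok, matches_builtin)
```

A check on particular values is of course no substitute for the general verification given above; it only guards against slips in the rules.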

We see at once that $(c - c') + c' = c$, so that subtraction is the inverse of addition, as
in ordinary algebra. Division is a little more complicated; we write

$$\frac1c = \frac{a - ib}{a^2 + b^2} = \left(\frac{a}{a^2 + b^2}, \frac{-b}{a^2 + b^2}\right).$$

We easily verify that with this definition $c(1/c) = 1$, so that we have defined the reciprocal
of a complex number except for the case of $a = b = 0$, which we write $1/c = \infty$. Then we
take for the ratio of two complex numbers $c$ and $c'$

$$\frac{c}{c'} = c\left(\frac{1}{c'}\right) = \left(\frac{aa' + bb'}{a'^2 + b'^2}, \frac{a'b - ab'}{a'^2 + b'^2}\right),$$

and we see that this, written in the $i$ notation, is

$$\frac{aa' + bb' + i(a'b - ab')}{a'^2 + b'^2} = \frac{(a + ib)(a' - ib')}{a'^2 + b'^2}.$$

We can verify that with this definition 



Also $(-b, a)(-b', a') = -(a, b)(a', b')$, which we can write $(ic)(ic') = -cc'$. (These should
be proved by means of the five laws of algebra stated above.)
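The reciprocal and the ratio can be treated in the same way. The sketch below is an illustration under assumed names (`reciprocal`, `div`) and sample values; it uses exact rational arithmetic from the standard `fractions` module, so that $c(1/c) = 1$ can be tested without rounding, and it also checks $(ic)(ic') = -cc'$ for one pair of values.

```python
from fractions import Fraction

def mul(c, d):
    # (a, b)(a', b') = (aa' - bb', ab' + a'b)
    a, b = c
    a2, b2 = d
    return (a * a2 - b * b2, a * b2 + a2 * b)

def reciprocal(c):
    # 1/c = (a / (a^2 + b^2), -b / (a^2 + b^2)); undefined for a = b = 0
    a, b = c
    n = a * a + b * b
    if n == 0:
        raise ZeroDivisionError("1/c is not defined for a = b = 0")
    return (Fraction(a, n), Fraction(-b, n))

def div(c, d):
    # c/c' = c (1/c')
    return mul(c, reciprocal(d))

i = (0, 1)
c, c1 = (2, 3), (-1, 4)                # arbitrary sample pairs

one = mul(c, reciprocal(c))            # should be the pair (1, 0)
ratio = div(c, c1)                     # ((aa' + bb')/(a'^2 + b'^2), (a'b - ab')/(a'^2 + b'^2))
ic_ic1 = mul(mul(i, c), mul(i, c1))    # (ic)(ic')
minus_cc1 = tuple(-x for x in mul(c, c1))

print(one, ratio, ic_ic1 == minus_cc1)
```

Here the case $a = b = 0$ raises an error instead of returning a value, which is one possible way of representing the exceptional case in a program.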

We have now verified that all the fundamental processes of algebra can be carried out 
with our number pairs, and the result will always be a number pair. All the rules can be 
stated in terms of real numbers, and therefore the consistency of the real number system 
implies that of the system of pairs of real numbers subject to these rules; and as each 
number pair (a, b) corresponds to a complex number c = a + ib we have a consistent 
algebra for the complex number system. 

If $a$, $b$ are physical magnitudes they must have the same dimensions; the relation of
complex magnitudes to complex numbers is similar to that of real magnitudes to real
numbers.

We have meanings at once for $\Re(c) > \Re(c')$ and $\Im(c) > \Im(c')$; there is no meaning for
$c > c'$. An important related quantity is the modulus or absolute value, which we write as
$|c|$, and define by

$$|c| = |a + ib| = (a^2 + b^2)^{1/2},$$

the positive root being taken. If $|a + ib| < M$, where $M$ is some positive real number,
then $|a|$ and $|b|$ separately are less than $M$. If $|a + ib| > M$, either $a^2 > \tfrac12 M^2$ or $b^2 > \tfrac12 M^2$.
The vanishing of $|c|$ is a necessary and sufficient condition for the vanishing of $a$
and $b$.

On account of this result the modulus thus defined plays the same part in the theory 
of the complex variable as in that of the real variable. We have always 

HI-Ml^l- 

Also if $|c| < \epsilon$, whatever positive value $\epsilon$ may have, then $c = 0$ and $a = b = 0$.

There are some differences in inequalities between complex and real numbers. If $a$
and $b$ are two real numbers we always have

$$|a^2 + b^2| \geq |a^2|.$$

But for complex numbers we have not necessarily

$$|c^2 + c'^2| \geq |c^2|.$$

For $c$ may be 1 and $c' = i$. Then $c^2 = 1$, $c'^2 = -1$, and the left side is 0 and the right 1.
We have, however,

$$|c| + |c'| \geq |c + c'|,$$

for 0, $c$, $c + c'$ are three points in the plane and the inequality is a case of 8·01(7).
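The counter-example and the inequality for the sum of two moduli can be confirmed with the built-in complex type. This is only a small check on the values named in the text; the additional sample pairs are arbitrary choices made here.

```python
c, c1 = 1 + 0j, 1j                      # the values c = 1, c' = i from the text

lhs = abs(c * c + c1 * c1)              # |c^2 + c'^2|
rhs = abs(c * c)                        # |c^2|

# A few arbitrary sample pairs for |c| + |c'| >= |c + c'|.
samples = [(c, c1), (3 - 4j, -1 + 2j), (0.5j, -0.5j), (2 + 0j, 2 + 0j)]
triangle_ok = all(abs(u) + abs(v) >= abs(u + v) for u, v in samples)

print(lhs, rhs, triangle_ok)
```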
Also 


S \c r \> 

r-l 


Also if A, p are real 


E <v 

r=l 


(Aa+/4&) 2 < (A 2 +/* 2 ) (o 2 + 6 2 ) 
by Cauchy’s inequality, and therefore 

I Aa +fib I 


a + ib 


s$(A 2 +fi 2 fK 


The notion of a limit can be extended to complex numbers. If c w — a n + ib u , and 
a n ->a , b n ->b, we say that c n ->c = a+ib. 


11·02. Differentiation and integration of a complex function of a real variable $x$.

Let $\phi$, $\psi$ be two functions that depend on a real variable $x$, and put

$$f = \phi + i\psi.$$

If $x$ receives a small increment $\delta x$ and $\phi$ and $\psi$ corresponding increments $\delta\phi$, $\delta\psi$, we have

$$\frac{\delta f}{\delta x} = \frac{\delta\phi}{\delta x} + i\frac{\delta\psi}{\delta x}.$$

Making $\delta x$ tend to limit zero we have ultimately

$$\frac{d\phi}{dx} + i\frac{d\psi}{dx},$$

which we take as the definition of $df/dx$. It exists only if both $\phi$ and $\psi$ are differentiable
at the value of $x$ considered.

Similarly, we can define

$$\int_a^b f\,dx = \int_a^b \phi\,dx + i\int_a^b \psi\,dx,$$

provided that $\phi$ and $\psi$ are integrable in the range $a$ to $b$.


11·03. Functions of a complex variable. Let $x$ and $y$ be a pair of real variables,
and express the complex pair $(x, y)$ briefly by

$$z = x + iy. \tag{1}$$

Let $\phi$ and $\psi$ be a pair of real functions of $x$ and $y$, and put

$$f = \phi + i\psi. \tag{2}$$

Consider what meaning we can attach to $df/dz$, if any. If $x$ and $y$ receive small increments
$\delta x$, $\delta y$,

$$\frac{\delta f}{\delta z} = \frac{\delta\phi + i\,\delta\psi}{\delta x + i\,\delta y}, \tag{3}$$

which we can always interpret by our rules.

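The question is whether this always tends to the same value when $\delta x$, $\delta y$ both tend to
zero. We get a necessary pair of conditions in the following way. Take $\delta y = 0$ and then let
$\delta x$ tend to 0; then if the partial derivatives exist

$$\frac{\delta f}{\delta z} \to \frac{\partial\phi}{\partial x} + i\frac{\partial\psi}{\partial x}. \tag{4}$$

But if we take $\delta x = 0$ and then let $\delta y$ tend to 0 we get

$$\frac{\delta f}{\delta z} \to \frac{\partial\psi}{\partial y} - i\frac{\partial\phi}{\partial y}, \tag{5}$$

and these can be equal only if

$$\frac{\partial\phi}{\partial x} = \frac{\partial\psi}{\partial y}, \quad \frac{\partial\phi}{\partial y} = -\frac{\partial\psi}{\partial x}. \tag{6}$$

These relations are called the Cauchy-Riemann relations. They evidently imply

$$\frac{\partial f}{\partial y} = i\frac{\partial f}{\partial x}. \tag{7}$$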


If they are not satisfied, $df/dz$ can have no unique meaning irrespective of the limiting
value of $\delta x/\delta y$, for $\delta f/\delta z$ will tend to different values according as $\delta x/\delta y$ tends to 0 or infinity.
The first requirement, if $df/dz$ is to have a meaning for all values of $x$, $y$ within a range
such that we can vary $x$ and $y$ independently, is therefore that (6) shall be true for all
these values.

The second requirement for physical applications is that the components in any
direction of the gradients of $\phi$ and $\psi$ shall be derived from those in the $x$, $y$ directions
by the vector rule; that is, if

$$\delta x = h\cos\theta, \quad \delta y = h\sin\theta, \tag{8}$$

and $\theta$ is fixed while $h \to 0$,

$$\frac{\phi(x + h\cos\theta, y + h\sin\theta) - \phi(x, y)}{h} \to \phi_x\cos\theta + \phi_y\sin\theta, \tag{9}$$

where $\phi_x$, $\phi_y$ are independent of $\theta$; with a similar relation for $\psi$. If further the limit is
approached uniformly with regard to $\theta$, $\phi$ and $\psi$ are differentiable as functions of two
variables in the sense defined in Chapter 5.


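Taking $\theta = 0$, $\theta = \tfrac12\pi$ we have, by definition of partial derivatives,

$$\phi_x = \frac{\partial\phi}{\partial x}, \quad \phi_y = \frac{\partial\phi}{\partial y}, \quad \psi_x = \frac{\partial\psi}{\partial x}, \quad \psi_y = \frac{\partial\psi}{\partial y}. \tag{10}$$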


A necessary and sufficient condition that (3) shall have a unique limit when $\delta x$, $\delta y \to 0$ is
that $\phi$, $\psi$ shall satisfy the Cauchy-Riemann relations and be differentiable.* If $\phi$, $\psi$ are
differentiable we have

$$\frac{f(x + h\cos\theta, y + h\sin\theta) - f(x, y)}{h(\cos\theta + i\sin\theta)} = \frac{\phi_x\cos\theta + \phi_y\sin\theta + i\psi_x\cos\theta + i\psi_y\sin\theta}{\cos\theta + i\sin\theta} + \frac{o(h)}{h}$$

$$= \phi_x + i\psi_x + o(1), \tag{11}$$

by the Cauchy-Riemann relations; and if $h \to 0$ we have

$$\frac{\delta f}{\delta z} \to \phi_x + i\psi_x, \tag{12}$$

which is independent of $\theta$. Hence the condition is sufficient.

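To show that it is necessary, let

$$\frac{f(x + h\cos\theta, y + h\sin\theta) - f(x, y)}{h(\cos\theta + i\sin\theta)} \to u + iv \tag{13}$$

when $h \to 0$, where $u$ and $v$ are real and independent of $\theta$. Multiply by $h(\cos\theta + i\sin\theta)$
and separate real and imaginary parts; then

$$\phi(x + h\cos\theta, y + h\sin\theta) - \phi(x, y) = h(u\cos\theta - v\sin\theta) + o(h), \tag{14}$$

$$\psi(x + h\cos\theta, y + h\sin\theta) - \psi(x, y) = h(u\sin\theta + v\cos\theta) + o(h), \tag{15}$$

and therefore $\phi$ and $\psi$ are differentiable; and

$$\phi_x = u = \psi_y, \quad \phi_y = -v = -\psi_x, \tag{16}$$

so that the Cauchy-Riemann relations are satisfied.

If $\phi$, $\psi$ satisfy the above conditions, and we take axes of $x'$, $y'$ so that

$$x' = lx + my, \quad y' = -mx + ly, \tag{17}$$

$$x = lx' - my', \quad y = mx' + ly', \tag{18}$$

then

$$\frac{\partial\phi}{\partial x'} = l\frac{\partial\phi}{\partial x} + m\frac{\partial\phi}{\partial y} = l\frac{\partial\psi}{\partial y} - m\frac{\partial\psi}{\partial x} = \frac{\partial\psi}{\partial y'}, \tag{19}$$

and similarly

$$\frac{\partial\psi}{\partial x'} = -\frac{\partial\phi}{\partial y'}. \tag{20}$$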


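Hence the Cauchy-Riemann relations are satisfied for axes in any direction. In
particular if $x'$ is taken along the normal to a curve, and $y'$ along the tangent, so that
$dx' = dn$, $dy' = ds$, the rotation from $dn$ to $ds$ being $+\tfrac12\pi$,

$$\frac{\partial\phi}{\partial n} = \frac{\partial\psi}{\partial s}, \quad \frac{\partial\phi}{\partial s} = -\frac{\partial\psi}{\partial n}. \tag{21}$$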

* S. Pollard, Proc. Lond. Math. Soc. (2) 21, 1923, 456-482.

Just as a function of a real variable may be defined only for a certain range of the
argument, or be continuous or differentiable in a certain range, a function of a complex
variable is defined for ranges of $x$ and $y$ (the range of $x$ for given $y$ possibly depending on
the value of $y$). The set of pairs of values $x$, $y$ such that the function is defined for them is
called a region. We shall give a geometrical interpretation in a moment. The essential
ideas have already appeared in Chapter 5.

If within a region

$$f(z) = \phi + i\psi, \quad \lambda(z) = \mu + i\nu$$

and $f(z)$, $\lambda(z)$ are so related that for any positive $\epsilon$ there is a $\delta$ (possibly depending on $z$) such
that for all complex $h$ satisfying $|h| < \delta$

$$\left|\frac{f(z + h) - f(z)}{h} - \lambda(z)\right| < \epsilon, \tag{22}$$

then $f(z)$ is said to be analytic in the region, and $\lambda(z)$ can be denoted by $f'(z)$ or $df/dz$. Subject
to these conditions $\phi$, $\psi$ satisfy the Cauchy-Riemann relations and are differentiable.

We can also speak of $f(z)$ as analytic in a closed region if there is also a unique limit
when $z$ is a boundary point and $z + h$ is restricted to be a point of the region.

We shall find that if (22) is true at all points of a region second derivatives of $f(z)$ exist;
and therefore if it is true the first derivatives are continuous. But this takes a great deal
of proof and we are not yet in a position to assume it.

Note that $z^* = x - iy$ is a differentiable function of $x$ and $y$, but is not an analytic function
of $z$ because it does not satisfy the Cauchy-Riemann relations.
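The dependence of the difference quotient on the direction of approach can be seen numerically. The sketch below is an illustration only: the helper `quotient`, the point `z0` and the step length are assumptions made here. It compares the quotient $\{f(z + h) - f(z)\}/h$ for $h$ taken along the $x$ and $y$ directions, first for $f = z^2$ and then for $f = z^*$.

```python
import cmath

def quotient(f, z, theta, h=1e-6):
    # difference quotient with the increment h (cos theta + i sin theta)
    step = h * cmath.exp(1j * theta)
    return (f(z + step) - f(z)) / step

def square(z):
    return z * z

def conj(z):
    return z.conjugate()

z0 = 0.3 + 0.4j                                   # arbitrary sample point

along_x_sq = quotient(square, z0, 0.0)            # increment along x
along_y_sq = quotient(square, z0, cmath.pi / 2)   # increment along y
along_x_cj = quotient(conj, z0, 0.0)
along_y_cj = quotient(conj, z0, cmath.pi / 2)

print(along_x_sq, along_y_sq)   # both near 2 * z0
print(along_x_cj, along_y_cj)   # near +1 and -1 respectively
```

For $z^2$ the two quotients agree to within the step length, while for $z^*$ they are $+1$ and $-1$, so that no unique limit exists.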

At present we shall consider only single-valued functions. This excludes functions
like $z^{1/2}$, which require special attention to their behaviour before we can say that they are
differentiable. We shall see that this difficulty can be avoided when we come to consider
branch points. An analytic single-valued function in a region is also called regular,†
holomorphic or monogenic in the region.

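We notice that if $\phi$ and $\psi$ have continuous second derivatives,

$$\frac{\partial^2\psi}{\partial x\,\partial y} = \frac{\partial^2\psi}{\partial y\,\partial x},$$

whence

$$\frac{\partial}{\partial x}\left(\frac{\partial\phi}{\partial x}\right) = \frac{\partial}{\partial y}\left(-\frac{\partial\phi}{\partial y}\right),$$

that is,

$$\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = 0. \tag{23}$$

Similarly,

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} = 0, \tag{24}$$

and $\phi$ and $\psi$ satisfy Laplace's equation in two dimensions.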
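A finite-difference check of (23) and (24) is easily made for a particular case. The sketch assumes, for illustration, the real and imaginary parts of $z^3$, namely $\phi = x^3 - 3xy^2$ and $\psi = 3x^2y - y^3$; the five-point difference formula and the sample points are likewise choices made here and not part of the text.

```python
# Assumed example: phi + i psi = (x + iy)**3
def phi(x, y):
    return x**3 - 3 * x * y**2

def psi(x, y):
    return 3 * x**2 * y - y**3

def laplacian(u, x, y, h=1e-3):
    # five-point approximation to u_xx + u_yy
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4 * u(x, y)) / (h * h)

points = [(0.3, -0.7), (1.2, 0.5), (-0.4, 0.9)]   # arbitrary sample points
residuals = [abs(laplacian(u, x, y)) for u in (phi, psi) for (x, y) in points]

# For contrast, x**2 + y**2 is not harmonic: its Laplacian is 4.
not_harmonic = laplacian(lambda x, y: x * x + y * y, 0.3, -0.7)

print(max(residuals), not_harmonic)
```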

We verify easily that if $\phi_1 + i\psi_1$, $\phi_2 + i\psi_2$ are two functions of $z$, their product function
$(\phi_1\phi_2 - \psi_1\psi_2) + i(\phi_1\psi_2 + \phi_2\psi_1)$ satisfies the relations (6) and therefore is a function of $z$.
Since $z$ itself is a function of $z$ it follows that $z \cdot z$ is another and so on to all positive integral
t ‘Regular’ is the most usual term in recent mathematical works. We avoid it because we need 
also to speak of regular singularities of a differential equation. 


powers of $z$. Also, since $\dfrac{d}{dz}(\phi + i\psi) = \dfrac{\partial}{\partial x}(\phi + i\psi)$, the usual formula for differentiation of
a product holds, namely,

$$\frac{d}{dz}(w_1 w_2) = w_1\frac{dw_2}{dz} + w_2\frac{dw_1}{dz}. \tag{25}$$


The same applies to sums and quotients; hence we infer that any rational function of z 
is an analytic function except possibly at points where the denominator is zero. 


We verify by induction that $\dfrac{d}{dz} z^n = nz^{n-1}$; alternatively

$$\frac{(z + h)^n - z^n}{(z + h) - z} = (z + h)^{n-1} + (z + h)^{n-2} z + \dots + z^{n-1}, \tag{26}$$

which gives the result on making $h$ tend to 0.
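The identity (26) and its limit can be checked for particular complex values. In the sketch below the values of $z$, $n$ and $h$ are arbitrary choices made for illustration.

```python
def lhs(z, h, n):
    # {(z + h)^n - z^n} / {(z + h) - z}
    return ((z + h) ** n - z ** n) / h

def rhs(z, h, n):
    # (z + h)^(n-1) + (z + h)^(n-2) z + ... + z^(n-1)
    return sum((z + h) ** (n - 1 - k) * z ** k for k in range(n))

z, n = 0.5 - 1.2j, 5                  # arbitrary sample values
h = 1e-3 * (1 + 1j)                   # a small complex increment

identity_gap = abs(lhs(z, h, n) - rhs(z, h, n))
limit_gap = abs(rhs(z, 1e-9, n) - n * z ** (n - 1))

print(identity_gap, limit_gap)        # both small
```

The right side of (26) is used for the limit because it involves no subtraction of nearly equal quantities, and so remains accurate when $h$ is very small.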

In general a function of the complex variable is defined originally only in a given
region of $x$ and $y$, and special devices are needed to give it a meaning outside that region.
This does not apply to rational functions, which could be calculated for any pair of values
by means of the fundamental rules. We need notice for them only that for certain values
of $z$ the division rule may fail to give a value for the function. Thus if

$$f(z) = z - 1,$$

our rule gives

$$\frac{1}{f(z)} = \frac{1}{x - 1 + iy} = \frac{x - 1 - iy}{(x - 1)^2 + y^2},$$

both components of which have the form 0/0 when $x = 1$, $y = 0$ and are indeterminate.
But we have for any other pair of values

$$\left|\frac1f\right| |f| = 1;$$

hence if $|f|$ tends to 0 as $z$ approaches some special value, $|1/f|$ will ultimately exceed any
specified finite value; we say that it tends to infinity. Such a point is an example of a pole
of the function $1/f$. We notice also that if for any reason $f(z)$ is indeterminate at $z = z_0$,

$$\frac1h\{f(z_0 + h) - f(z_0)\}$$

is indeterminate; hence our rule for defining a derivative fails and $z_0$ does not belong to
any region where the function is analytic.

This result suggests a difficulty in the treatment of such a function as $f = \dfrac{z + 1}{z^2 - 1}$ at
$z = -1$. Applying the rules we find that $\phi$ and $\psi$ take the form 0/0. But this can be
avoided; we can show that for any $z$ other than $\pm 1$

$$\frac{z + 1}{z^2 - 1} = \frac{1}{z - 1},$$

which has a definite value $-\tfrac12$ at $z = -1$. We can take $f = -\tfrac12$ at $z = -1$, and if we do so
$f$ is found to be differentiable there. Similar considerations apply to such functions as
$(1/z)\sin z$ at $z = 0$; if a function analytic elsewhere tends to a unique limit at a particular
point, the limit may be taken as the value of the function at the point even though the
direct application of the rules gives no determinate result. This process is called definition
by continuity.
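Definition by continuity is readily expressed in a program. The sketch below is an illustration with assumed names and sample increments: it assigns the value $-\tfrac12$ at $z = -1$ explicitly, and then checks that values at neighbouring points approach it from several directions and that the difference quotient there is near $-\tfrac14$, the derivative of $1/(z - 1)$ at $z = -1$.

```python
def f(z):
    # (z + 1) / (z^2 - 1), with the value at z = -1 supplied by continuity
    if z == -1:
        return -0.5 + 0j
    return (z + 1) / (z * z - 1)

# Values at points approaching z = -1 from different directions.
increments = (1e-2, 1e-4 * 1j, -1e-6)
near = [f(-1 + h) for h in increments]
gaps = [abs(value - f(-1)) for value in near]

# Difference quotient at z = -1, to compare with -1/(z - 1)^2 = -1/4.
h = 1e-4
dq = (f(-1 + h) - f(-1)) / h

print(gaps, dq)
```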

In consequence of the rules that if $\phi^2$ or $\psi^2 > M^2$, $|\phi + i\psi| > M$, and that if $|\phi + i\psi| > M$
either $\phi^2$ or $\psi^2 > \tfrac12 M^2$, we can say that if the modulus of a function tends to infinity at a
point, either its real or its imaginary part will be unbounded, and speak of the function
itself as unbounded. Similarly if either the real or the imaginary part is unbounded
the modulus is unbounded. We speak of a function $f(z)$ as bounded in a region if we can
choose a positive quantity $M$ such that at all points of the region $|f(z)| < M$.

11·04. The Argand diagram.* A geometrical representation of functions of a
complex variable can be obtained by regarding $x$ and $y$ as rectangular coordinates of a
point in a plane; this point is completely identified by the real and imaginary parts of $z$.
Then the functions $\phi$ and $\psi$ are functions of position in the plane, and satisfy Laplace's
equation in two dimensions. From the point of view of pure mathematics this device is
only an aid to visualization, but in physical applications it is often much more; $x$ and $y$
may really be coordinates, and $\phi$ and $\psi$ quantities with definite physical meanings, which
it is usually our task to determine as functions of $x$ and $y$. Further,

$$|z| = (x^2 + y^2)^{1/2} = r,$$

where $r$ is the distance of $(x, y)$ from the origin. There will also be an angle $\theta$ such that

$$x = r\cos\theta, \quad y = r\sin\theta,$$

and therefore $x + iy$ is equivalent to $r(\cos\theta + i\sin\theta)$.

Then $\theta$ is called the argument† or phase of $z$ and denoted by $\arg z$. But $\theta$ is not single-valued;
we could alter it by any integral multiple of $2\pi$ and still get the same values of
$\cos\theta$ and $\sin\theta$, and hence the same values of $x$ and $y$ for given $r$. It can be made single-valued
by the following device. When $x > 0$, $y = 0$, we take $\arg z = 0$; for any other $z$ we
take $\arg z$ to vary continuously, that is, jumps of $2\pi$ are not allowed; and we make it a
rule never to cross the negative real axis. Then for $x$ negative and $y$ small and positive,
$\arg z$ is nearly $\pi$; for $x$ negative and $y$ small and negative, $\arg z$ is nearly $-\pi$; and for all
$x$, $y$, $-\pi \leq \arg z \leq \pi$. This makes a change approaching $2\pi$ on crossing the negative real
axis, but we avoid this by not crossing it; we take $\arg z = \pi$ on the negative real axis and
approaching, but never attaining, $-\pi$ as we approach the negative real axis from the
side of $y$ negative. Then we write $-\pi < \arg z \leq \pi$. We could equally well, of course, take
$-\pi \leq \arg z < \pi$. We shall have several other occasions to use this device of cuts to avoid
ambiguities; they are particularly important in the use of conformal representation. The
value of the argument, with this restriction, is called its principal value.
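The principal value described above is what the standard two-argument arctangent supplies. The sketch below is an illustration in which the wrapper name `principal_arg` and the sample values are assumptions; it uses `math.atan2(y, x)` and checks the behaviour on and just below the negative real axis.

```python
import math

def principal_arg(x, y):
    # principal value of arg(x + iy), lying in the range -pi < arg <= pi
    # Caveat: a negative floating-point zero for y (that is, -0.0) with x < 0
    # yields -pi, so y = 0 should be supplied as +0.0 on the negative real axis.
    return math.atan2(y, x)

on_positive_axis = principal_arg(1.0, 0.0)      # 0 on the positive real axis
on_negative_axis = principal_arg(-1.0, 0.0)     # pi on the negative real axis
just_below = principal_arg(-1.0, -1e-12)        # close to, but greater than, -pi
just_above = principal_arg(-1.0, 1e-12)         # close to, but less than, pi

# Recover x and y from r and theta for a sample point.
x, y = -3.0, 4.0
r, theta = math.hypot(x, y), principal_arg(x, y)
recovered = (r * math.cos(theta), r * math.sin(theta))

print(on_negative_axis, just_below, just_above, recovered)
```

The jump of nearly $2\pi$ between `just_above` and `just_below` is the discontinuity across the cut along the negative real axis.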

We sometimes write, if $z \neq 0$,

$$z/|z| = \cos\theta + i\sin\theta = \operatorname{sgn} z,$$

and more generally, if $f(z) \neq 0$,

$$f(z)/|f(z)| = \operatorname{sgn} f(z).$$

If $f(z) = 0$ we take $\operatorname{sgn} f(z) = 0$. This expression plays the part of a direction vector.

* Given first by C. Wessel (1797), J. R. Argand (1806). But J. Wallis (1673) is stated by 
E. T. Bell to have missed it by a hairsbreadth, if at all. 

† The word argument is also used in the sense that if $f(x)$ is a function of $x$, $x$ is called the
argument of $f(x)$. It will generally be clear from the context which sense is intended, but it would
be an advantage if pure mathematicians would agree to alter one of the usages. Amplitude is also
used; this can only be considered a disaster.

If $f(z)$ is a function of $z$, $\phi$ and $\psi$ can similarly be regarded as coordinates of a point in
another plane; and the function expresses a correspondence between points in the $z$ plane
and in the $f(z)$ plane.

If $f'(z) = 0$ throughout a region, $f(z)$ is constant in the region. For if $f(z) = \phi + i\psi$, it follows
that throughout the region

$$\frac{\partial\phi}{\partial x} = \frac{\partial\psi}{\partial y} = \frac{\partial\psi}{\partial x} = -\frac{\partial\phi}{\partial y} = 0.$$

But then it follows from 5·033 that $\phi$ and $\psi$, and therefore $f$, are constant.

Note that if $\phi$ is constant in a region, so is $\psi$, by the Cauchy-Riemann relations, and
conversely.

11·041. Continuity. A function $f(z)$ is said to be continuous at $z = z_0$ if for any positive $\epsilon$
we can choose $\delta$ so that $|f(z_0 + h) - f(z_0)| < \epsilon$ for all $|h| < \delta$. It follows immediately from
the definition of an analytic function that if $f(z)$ is analytic at $z = z_0$ it is also continuous
there.

If f(z) is analytic at all points within a given boundary it is not necessarily analytic or 
even continuous at a point on the boundary. For instance, if 

$f(z)$ is analytic (and therefore continuous) at every value of $z$ such that $|z| < 1$. The boundary
of the region is the circle $|z| = 1$; but the point $z = 1$ is on this circle and $f(z)$ is there
discontinuous.

We shall speak of $f(z)$ as continuous in a closed region if, as in 5·031, for any $z$ of the region
and for any positive $\epsilon$ there is a $\delta(z, \epsilon)$ such that for every $z_1$ of the region satisfying

$$|z_1 - z| < \delta(z, \epsilon),$$

we have $|f(z_1) - f(z)| < \epsilon$.

11·042. Uniformity of continuity. If $f(z)$ is continuous in a closed region, $f(z)$ is
uniformly continuous; that is, for any $\epsilon$ we can choose $\delta$ so that $|f(z + h) - f(z)| < \epsilon$, for all
values of $z$ and $z + h$ of the closed region and satisfying $|h| < \delta$. $\phi$ and $\psi$ are continuous;
hence by 5·031, for a given $\epsilon$ we can choose $\delta$ so that, whenever $(x_1, y_1)$, $(x_2, y_2)$ are points
of the region satisfying

$$(x_1 - x_2)^2 + (y_1 - y_2)^2 < \delta^2,$$

$$|\phi(x_1, y_1) - \phi(x_2, y_2)| < \epsilon, \qquad |\psi(x_1, y_1) - \psi(x_2, y_2)| < \epsilon,$$

and therefore

$$|f(z_1) - f(z_2)| < \epsilon\sqrt2.$$

11·043. Goursat’s lemma. Let $f'(z)$ exist at all points of a closed region $D$; and let $\epsilon$ be
an arbitrary small positive quantity. Then a set of squares $G_r$ can be superposed on $D$, each
containing a point $z_r$ common to $G_r$ and $D$ such that for all other points common to $G_r$ and $D$

$$|f(z) - f(z_r) - (z - z_r)f'(z_r)| < \epsilon|z - z_r|.$$

For any $Z$ of $D$ there is a neighbourhood of $Z$ such that if $z$ is in this neighbourhood and
in $D$

$$|f(z) - f(Z) - (z - Z)f'(Z)| < \epsilon|z - Z|.$$

This neighbourhood is a region $I$ as in the modified Heine-Borel theorem (5·021), which
therefore applies directly. A $z_r$ may be on the boundary of a $G_r$ provided it is not exterior
to $D$; if it is on a boundary between two $G_r$ it may be used for both if the inequality is
satisfied. The $G_r$ need not all be equal.

We shall call a $G_r$ satisfying this condition an $\epsilon$-neighbourhood of $z_r$. The lemma has
already appeared in a more general form in 5·042, but we state the case of it that we shall
need in 11·052.


11·05. Integration. We attach a meaning to the integral of a function $f(z)$ of a complex
variable along a rectifiable curve $L$ with termini $z_0$, $Z$ as in 5·06 (2); that is, we take as
parameter the arc $s$ measured along the curve, and consider the set of points $z_r$ for increasing $s_r$.
In each interval $(s_r, s_{r+1})$ take a point $\zeta_r$ of the curve and consider the sum, with $z_{n+1} = Z$,

$$S_n = \sum f(\zeta_r)(z_{r+1} - z_r) = \sum\{\phi(\zeta_r)(x_{r+1} - x_r) - \psi(\zeta_r)(y_{r+1} - y_r)\} + i\sum\{\phi(\zeta_r)(y_{r+1} - y_r) + \psi(\zeta_r)(x_{r+1} - x_r)\}. \tag{1}$$

We take a positive quantity $\delta$, and since the curve is rectifiable we can take all the intervals
$s_{r+1} - s_r < \delta$ with $n$ finite. As $\delta \to 0$, and $n$ correspondingly to infinity, the sums in $S_n$
define the two sums of Stieltjes integrals $\int\phi\,dx - \psi\,dy$, $\int\phi\,dy + \psi\,dx$. $x$, $y$ are of bounded
variation on the curve; if $f(z)$ is analytic on the curve, $\phi$ and $\psi$ are continuous on the curve
with regard to $s$. Hence the integrals exist and we can write

$$\lim_{\delta\to0} S_n = \int_L f(z)\,dz. \tag{2}$$

We sometimes need to consider the integral of a function along a curve that forms part
of the boundary of the region where the function is analytic. Then $\phi$ and $\psi$ may not be
continuous on the curve but the integrals exist subject to the conditions stated for Stieltjes
integrals.
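
The sum $S_n$ can be formed numerically as a rough check on the definition. The following Python sketch is illustrative only: the names `contour_sum` and `semicircle` are ours, the curve is described by a parameter $t$ in $[0, 1]$ proportional to the arc, and each $\zeta_r$ is taken at the middle of its parameter interval, which is one admissible choice among many. The comparison value $\tfrac13(Z^3 - z_0^3)$ assumes the indefinite integral of $z^2$, which the text justifies only later, in 11·056.

```python
import cmath
import math


def contour_sum(f, path, n):
    """S_n = sum of f(zeta_r) (z_{r+1} - z_r) over n intervals of the parameter."""
    total = 0j
    for r in range(n):
        z_lo = path(r / n)
        z_hi = path((r + 1) / n)
        zeta = path((r + 0.5) / n)  # a point of the curve inside the interval
        total += f(zeta) * (z_hi - z_lo)
    return total


def semicircle(t):
    """Upper unit semicircle from z0 = 1 (t = 0) to Z = -1 (t = 1)."""
    return cmath.exp(1j * math.pi * t)


approx = contour_sum(lambda z: z * z, semicircle, 2000)
exact = ((-1) ** 3 - 1 ** 3) / 3  # (Z^3 - z0^3) / 3
```

Here $|f(z)| = 1$ on the curve and the length of the curve is $\pi$, so the modulus of the sum cannot exceed $\pi$.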

Note that if $|f(z)| < M$ at all points of the curve

$$|f(\zeta_r)(z_{r+1} - z_r)| < M|z_{r+1} - z_r|, \tag{3}$$

$$\left|\sum f(\zeta_r)(z_{r+1} - z_r)\right| < M\sum|z_{r+1} - z_r|, \tag{4}$$

and therefore, proceeding to the limit, if $K$ is the length of the curve,

$$\left|\int_L f(z)\,dz\right| \le MK. \tag{5}$$


11·051. Two special integrals. Clearly

$$\int_{z_0}^{Z} dz = \lim\{(z_1 - z_0) + (z_2 - z_1) + \dots + (Z - z_n)\} = Z - z_0. \tag{1}$$

Also

$$z_{r-1}(z_r - z_{r-1}) = \tfrac12\{z_r^2 - z_{r-1}^2 - (z_r - z_{r-1})^2\}$$

and therefore if $z_{n+1} = Z$

$$z_0(z_1 - z_0) + z_1(z_2 - z_1) + \dots + z_n(Z - z_n) = \tfrac12(Z^2 - z_0^2) - \tfrac12\sum_{r=1}^{n+1}(z_r - z_{r-1})^2.$$

Take points $z_r$ at intervals $< \omega$ along the path; then the modulus of the sum on the right
is $< \omega K$, and

$$\int_{z_0}^{Z} z\,dz = \tfrac12(Z^2 - z_0^2). \tag{2}$$
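
The algebraic identity behind (2) holds for every subdivision, not only in the limit, and this can be checked numerically. The sketch below is illustrative: the spiral path from $1$ to $2i$ and the names `points`, `left_sum` and `correction` are our own choices.

```python
import cmath
import math


def points(path, n):
    """The points z_0, z_1, ..., z_n = Z on the path for equal parameter steps."""
    return [path(r / n) for r in range(n + 1)]


# An arbitrary curved path from z0 = 1 to Z = 2i (chosen only for illustration).
pts = points(lambda t: (1 + t) * cmath.exp(1j * math.pi * t / 2), 400)
z0, Z = pts[0], pts[-1]
pairs = list(zip(pts, pts[1:]))

sum_dz = sum(b - a for a, b in pairs)                      # telescopes to Z - z0
left_sum = sum(a * (b - a) for a, b in pairs)              # sum of z_{r-1}(z_r - z_{r-1})
correction = 0.5 * sum((b - a) ** 2 for a, b in pairs)     # half the sum of squares
omega = max(abs(b - a) for a, b in pairs)                  # largest step
polygon_length = sum(abs(b - a) for a, b in pairs)         # does not exceed K
```

The correction term is bounded in modulus by $\tfrac12\omega K$, so it disappears as the subdivision is refined and `left_sum` tends to $\tfrac12(Z^2 - z_0^2) = -\tfrac52$ for this path.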


Consider now the integral of $f(z)$ round a contour $C$ in an $\epsilon$-neighbourhood of $z_0$. Let the
upper bound of $|z - z_0|$ be $\alpha$. Then

$$f(z) = f(z_0) + (z - z_0)f'(z_0) + (z - z_0)v, \tag{3}$$

where $|v| < \epsilon$; and

$$\int_C f(z)\,dz = \int_C\{f(z_0) + (z - z_0)f'(z_0)\}\,dz + \int_C(z - z_0)v\,dz. \tag{4}$$

The first integral on the right is zero by (1) and (2) because the termini are identical. Also

$$\left|\int_C(z - z_0)v\,dz\right| < \int_C\alpha\epsilon\,ds \le \epsilon\alpha\lambda, \tag{5}$$

where $\lambda$ is the length of the boundary $C$. Hence

$$\left|\int_C f(z)\,dz\right| < \epsilon\alpha\lambda. \tag{6}$$

We are concerned with two types of contour: (1) a square of side $b$, so that

$$\alpha \le b\sqrt2, \qquad \lambda = 4b, \qquad \alpha\lambda \le 4\sqrt2\,b^2 < 6b^2;$$

(2) a square with part of its boundary replaced by a curve within the square. In the second
case let the length of the curved part be $\mu$. Then in case (1)

$$\left|\int_C f(z)\,dz\right| < 6\epsilon A, \tag{7}$$

where $A$ is the area; and in case (2), since $\lambda < 4b + \mu$,

$$\left|\int_C f(z)\,dz\right| < 6\epsilon A + \sqrt2\,\epsilon b\mu, \tag{8}$$

$A$ being the whole area of the square.

11·052. Cauchy’s theorem. If $f(z)$ is analytic in the closed region bounded by $C$, where $C$
is a contour of finite length, then $\int_C f(z)\,dz = 0$.

Denote $C$ and its interior by $D$. We can surround $C$ by a square $E$ of side $B$. By Goursat’s
lemma, for any positive $\omega$ we can subdivide $E$ into a finite set of squares $G_n$ of side $b_n$,
such that for every $G_n$ containing points of $D$ there is a point $z_n$ common to $G_n$ and $D$
such that for every $z$ common to $G_n$ and $D$

$$|f(z) - f(z_n) - (z - z_n)f'(z_n)| < \omega|z - z_n|.$$

Some of the $G_n$ will be wholly within $D$. Others will be intersected by $C$. Consider for each
$G_n$ the integral $\int f(z)\,dz$ taken round the boundary of $G_n$ if $G_n$ is entirely within $D$; if
$G_n$ is intersected by $C$, take the integral about the part of the boundary of $G_n$ that lies in
$D$ together with the part of $C$ that lies in $G_n$. Then all these are closed contours satisfying
the conditions of the last section of 11·051. If the area of an internal square $G_n$ is $A_n$,

$$\left|\int_{G_n} f(z)\,dz\right| < 6\omega A_n.$$

If $G_p$ is a square of side $b_p$ overlapping $C$, and the part of $C$ within it has length $\mu_p$,

$$\left|\int_{G_p} f(z)\,dz\right| < 6\omega A_p + \sqrt2\,\omega b_p\mu_p.$$

Take the sum of all the integrals, all circuits being described in the direction of positive
rotation. Then every interior side or part of a side of a $G_n$ traversed in forming the
corresponding integral is traversed in the opposite direction in forming that for an
adjacent $G_n$. Hence all contributions from internal sides cancel and the sum is simply
the integral about $C$. Hence

$$\left|\int_C f(z)\,dz\right| < 6\omega\left(\sum A_n + \sum A_p\right) + 2\omega\sum b_p\mu_p.$$

Since the $G_n$, $G_p$ are non-overlapping parts of $E$,

$$\sum A_n + \sum A_p \le B^2.$$

Also $b_p \le B$; and $\sum\mu_p = L$, the length of $C$, which is supposed finite. Hence

$$\left|\int_C f(z)\,dz\right| < \omega(6B^2 + 2BL).$$

The left side and the second factor on the right are independent of $\omega$. Hence, since $\omega$ is
arbitrarily small,

$$\int_C f(z)\,dz = 0.$$
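
A numerical experiment makes the content of the theorem concrete, though it proves nothing. In the Python sketch below (the helper `closed_polygon_integral` and the choice of the unit square are ours) the integral of the analytic function $\exp z$ round a square is negligibly small, whereas that of the non-analytic function $\bar z$ round the same square is $2i$ times the enclosed area.

```python
import cmath


def closed_polygon_integral(f, vertices, n_per_side=2000):
    """Midpoint-rule approximation to the integral of f round a closed polygon."""
    total = 0j
    closed = list(vertices) + [vertices[0]]
    for a, b in zip(closed, closed[1:]):
        step = (b - a) / n_per_side
        for k in range(n_per_side):
            total += f(a + (k + 0.5) * step) * step
    return total


# Unit square described in the direction of positive rotation.
square = [0j, 1 + 0j, 1 + 1j, 1j]

analytic = closed_polygon_integral(cmath.exp, square)
non_analytic = closed_polygon_integral(lambda z: z.conjugate(), square)
```

The second result shows that the vanishing of the integral is a property of analytic functions and not of continuous functions in general.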


Cauchy’s theorem is the pivotal theorem of the theory of the complex variable. It
often seems surprising at first sight that such a restriction on the boundary values of a
function should be deducible merely from the condition that the function must have a
derivative within and on the boundary. It becomes less surprising, however, when we see
first that the function should really be regarded as a pair of functions and that the
definition of a derivative that is being used implies two exact relations between the partial
derivatives of these functions at every point where its existence is asserted.

Several mathematicians have contributed to the relaxation of the conditions for the
theorem. The proof subject only to the conditions that $f(z)$ has a derivative and $C$ has
a finite length is due to Goursat.


11·053. Relation to Green’s lemma. An alternative approach to Cauchy’s
theorem, similar to Riemann’s treatment, is to use the two-dimensional form of Green’s
lemma. If $u$ and $v$ are two functions of $x$ and $y$ with continuous first derivatives within
and on $C$,

$$\iint\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right)dx\,dy = \int_C(lu + mv)\,ds, \tag{1}$$

where the integral on the left is taken through the interior of $C$, and $l$ and $m$ are the
direction cosines of the outward normal to $C$ with regard to the axes of $x$ and $y$. The
conditions are more restrictive than in the proof already given since they assume
continuity of the separate derivatives; this is the most manageable sufficient condition for
the reversal of the order of integration in the proof of Green’s theorem. The direction
cosines of the tangent to $C$, proceeding in the direction of positive rotation, are $(-m, l)$,
so that for displacements along $C$

$$\frac{dx}{ds} = -m, \qquad \frac{dy}{ds} = l \tag{2}$$

and

$$\int_C(lu + mv)\,ds = \int_C(u\,dy - v\,dx). \tag{3}$$

Now take $f(z) = \phi + i\psi$, where $\phi$, $\psi$ satisfy the Cauchy-Riemann relations; then

$$\int_C f(z)\,dz = \int_C(\phi + i\psi)(dx + i\,dy) = \int_C(\phi\,dx - \psi\,dy) + i\int_C(\psi\,dx + \phi\,dy)$$

$$= -\iint\left(\frac{\partial\psi}{\partial x} + \frac{\partial\phi}{\partial y}\right)dx\,dy + i\iint\left(-\frac{\partial\psi}{\partial y} + \frac{\partial\phi}{\partial x}\right)dx\,dy = 0. \tag{4}$$


Green’s lemma can be proved in three dimensions in somewhat wider conditions than we used
in 5·08, if we use methods similar to those used in Cauchy’s theorem. Note first that if

$$u_i = a_i + b_{ik}x_k,$$

where $a_i$ and $b_{ik}$ are constants, the result follows at once. Note also that if the volume integral of
div $u$ exists the interior of $S$ can be divided up into regions of volume $\delta\tau_n$ such that however we choose
a specimen value of div $u$ in each, at $R_n$, say, we cannot alter the sum $\sum(\operatorname{div}u)_n\,\delta\tau_n$ by more than $\epsilon$.
Now if $P$ is $x_i$ and $Q$ is $x_i + x_i'$, where $x_i'^2 = r^2$, and each $u_i$ is differentiable, there is a neighbourhood of
$P$ such that

$$u_i(Q) = u_i(P) + x_k'\left(\frac{\partial u_i}{\partial x_k}\right)_P + v_i r,$$

where $|v_i| < \omega$ for $r < \delta$. Then over a surface $G_P$ of volume $\tau_P$ enclosing $P$

$$\iint_{G_P} l_i u_i\,dS = \left(\frac{\partial u_i}{\partial x_i}\right)_P\tau_P + \iint_{G_P} l_i v_i r\,dS. \tag{6}$$


By subdivisions of the regions $\delta\tau_n$ into $\delta'\tau_p$ and making straightforward modifications of the proof
of Cauchy’s theorem we can show that the sum of the values of the last term for all $G_P$, added for all
$\delta\tau_n$, is less in modulus than $M\omega$, where $M$ is fixed. Hence there is at least one $P_{np}$ within each $\delta'\tau_p$
such that

$$\Bigl|\;\cdots\;\Bigr| < M\omega, \tag{7}$$

and, since the volume integral exists

$$\cdots \tag{8}$$

$$\cdots \tag{9}$$

Hence

$$\left|\iint l_i u_i\,dS - \iiint\operatorname{div}u\,d\tau\right| < M\omega + 2\epsilon, \tag{10}$$

and therefore must be zero since $\epsilon$ and $\omega$ are both arbitrary. Hence sufficient conditions for Green’s
lemma are that $u_i$ is differentiable on and inside $S$, that the volume integral of div $u$ exists, and that
$S$ is bounded and has a finite area given as in 5·07.
Since differentiability is a sufficient condition for the derivatives of $u_i$ to be a tensor, these
conditions cover all cases where we should want to use Green’s lemma.

Note that the proof must proceed in two stages, the usual rectangular subdivision being used in
each. In the first we make the regions $\delta\tau_n$ small enough for the sum of $(\operatorname{div}u)_n\,\delta\tau_n$ to give a good
approximation to the volume integral; in the second we subdivide the $\delta\tau_n$ into $\delta'\tau_p$ so that the
properties of differentiable functions can be used to establish the approximation of the sum to the
surface integral. The former step does not arise in Cauchy’s theorem because the analogue of div $u$
is zero.

11·054. Extension of Cauchy’s theorem. The theorem also holds under the
following conditions: $f(z)$ is analytic within $C$, and continuous in the closed region bounded
by $C$. Following Goursat,* we suppose that $C$ is such that there is at least one internal
point $c$ such that every straight line from $c$ intersects $C$ just once. Then if $z' = c + \lambda(z - c)$,
where $0 < \lambda < 1$, and $z$ is on $C$, $z'$ is within $C$, and as $z$ travels around $C$, $z'$ describes a contour
$C'$ arbitrarily near to $C$. Then $\int_{C'} f(z')\,dz' = 0$, by what we have proved; that is,

$$\int_C f\{c + \lambda(z - c)\}\,\lambda\,dz = 0; \tag{1}$$

and we can omit the constant non-zero factor $\lambda$. But by the principle of uniform
continuity, for any $\omega$ we can take $\delta$ so that $|f(z) - f(z')| < \omega$ for all $|z - z'| < \delta$. But

$$|z - z'| = (1 - \lambda)|z - c| \tag{2}$$

and $|z - c|$ is bounded, say $< R$. Hence the condition is satisfied if

$$1 - \lambda = \delta/R. \tag{3}$$

Then

$$\int_C f(z)\,dz - \int_C f(z')\,dz = \int_C\{f(z) - f(z')\}\,dz \tag{4}$$

and

$$\left|\int_C\{f(z) - f(z')\}\,dz\right| < \omega L. \tag{5}$$

Hence $\left|\int_C f(z)\,dz\right| < \omega L$ and therefore is zero.



If $C$ is such that its interior can be cut up into a finite number of regions each with a
suitable internal point, the result follows by addition. There are contours $C$ of finite
length that require an infinite number of subdivisions before each region will satisfy
Goursat’s condition, and then the theorem remains true, but becomes much more difficult
to prove.† We shall not need to consider such cases.

This extension of Cauchy’s theorem appears to be necessary for some physical
applications, where we wish to determine a function analytic within a contour and taking given
values on the contour, though the derivative of the function at some points on the contour
may not exist in the sense of the theory of functions of a complex variable, or, indeed,
in that of the theory for a real variable. The further extension to cases where $f(z)$
behaves like $1/(z - z_0)$ near a point $z_0$ on the contour is impossible because there is no
unique way of defining the integral through such a point; we shall refer later to the
principal value of such an integral, but this is not equal to the limit of the integral
around $C'$.

* Cours d'Analyse, 2, 1905, 88.

† S. Pollard, Proc. Lond. Math. Soc. (2), 21, 1923, 456-82; M. H. A. Newman, Topology of Plane
Sets of Points, 1939, 154-6.

The theorem remains true if $f(z)$ is bounded but has a finite number of discontinuities
on $C$. We cut out parts of $C$ by drawing circular arcs of radius $\eta$ about the discontinuities,
and take $C''$ to consist of the parts of these arcs within $C$ and of the rest of $C$. Then the
theorem applies to show that $\int_{C''} f(z)\,dz = 0$. For any arc and the part cut out, $\int f(z)\,dz = O(\eta)$,
since $f(z)$ is bounded; hence the theorem again follows. But (cf. 14·05) a simple
discontinuity of $\Re f(z)$ implies that $\Im f(z)$ is unbounded.*


11·055. An important corollary follows at once. Suppose that $f(z)$ is analytic between
two contours $C$ and $C'$, of which $C$ encloses $C'$, and continuous as $z$ approaches either $C$
or $C'$. Draw two lines $AB$, $EF$ close together, so as to connect the two contours. Then
$ABDEFGA$, described as shown, is a closed contour and $f(z)$ is analytic within it and
continuous as $z$ approaches it. Denote it by $S$. Then

$$\int_S f(z)\,dz = 0.$$

Now let the lines $AB$, $FE$ be made to approach indefinitely close together. The
contribution from the part $BDE$ tends to the integral around $C$ in the positive direction.
That from $FGA$ tends to that round $C'$ in the negative direction and therefore to minus
that round $C'$ in the positive direction. The contributions from $AB$, $EF$ approach equal
and opposite values since they ultimately become the same path described in opposite
directions. Hence if we agree to take the same sense of description of both contours,

$$\int_C f(z)\,dz = \int_{C'} f(z)\,dz.$$

Hence: if a function is analytic between two contours, and continuous on approaching them,
its integral with regard to z round each contour has the same value.

If the argument used in proving Cauchy’s theorem is applied to the region between
$C$ and $C'$, the result will be seen to follow directly. We need a separate proof in this
case only because in proving Cauchy’s theorem we assumed the region to be simply
connected.

An immediate extension is to the case where $C$ encloses several closed paths $C'$, $C''$, ...,
all external to one another. We can show similarly that the integral around $C$ is equal to
the sum of the integrals around $C'$, $C''$, ..., provided that the function is analytic at all
points that lie within $C$ and outside $C'$, $C''$, ... and continuous in the closed region.
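
The corollary can be illustrated numerically with $f(z) = 1/z$, which is analytic between any two contours that both enclose the origin. In the Python sketch below the contours, the helper `closed_polygon_integral` and the name `reciprocal` are our own illustrative choices; the common value $2\pi i$ is the familiar one for this function and is quoted only as a check.

```python
import cmath
import math


def closed_polygon_integral(f, vertices, n_per_side):
    """Midpoint-rule approximation to the integral of f round a closed polygon."""
    total = 0j
    closed = list(vertices) + [vertices[0]]
    for a, b in zip(closed, closed[1:]):
        step = (b - a) / n_per_side
        for k in range(n_per_side):
            total += f(a + (k + 0.5) * step) * step
    return total


def reciprocal(z):
    return 1 / z


# Outer contour C: square of side 4 about the origin, described anticlockwise.
outer = [2 + 2j, -2 + 2j, -2 - 2j, 2 - 2j]
# Inner contour C': regular 720-gon inscribed in the unit circle, same sense.
inner = [cmath.exp(2j * math.pi * k / 720) for k in range(720)]

around_outer = closed_polygon_integral(reciprocal, outer, 2000)
around_inner = closed_polygon_integral(reciprocal, inner, 10)
```

The two results agree to within the error of the quadrature, although neither is zero, since $1/z$ is not analytic throughout the interior of either contour.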

* A general proof is given by Littlewood, Theory of Functions, p. 144.

11·056. Integral of an analytic function. It follows from Cauchy’s theorem that
if $L$, $L'$ are two paths connecting $z_0$ and $Z$, and $f(z)$ is analytic on $L$, $L'$ and at all points
between them,

$$\int_L f(z)\,dz = \int_{L'} f(z)\,dz = F(Z),$$

say, and $F(Z)$ is single-valued. Also if $Z + \zeta$ is in the region where $f(z)$ is analytic, and $|\zeta|$ is
small enough, we can take $L''$ to coincide with $L$ from $z_0$ to $Z$ and then to proceed to $Z + \zeta$
in a straight line. Hence

$$F(Z + \zeta) - F(Z) = \int_Z^{Z+\zeta} f(z)\,dz,$$

where the integral is along a straight line. Let $\zeta$ tend to 0; $f(z)$ will differ from $f(Z)$ by an
arbitrarily small amount, and

$$\frac{F(Z + \zeta) - F(Z)}{\zeta} \to f(Z).$$

Hence $F'(z) = f(z)$, and $F(z)$ is an analytic function of $z$.

This theorem should be compared with the three-dimensional one of 5·09 (3).

It follows that if $G(z)$ is analytic within a region and $G'(z) = f(z)$, $F(z) - G(z)$ is constant;
for it is an analytic function with zero derivative. This enables us to extend to complex
integrals the method of integration by finding an indefinite integral.
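
Both statements, independence of the path and $F'(z) = f(z)$, can be checked numerically for a particular function. The Python sketch below takes $f(z) = \cos z$, whose indefinite integral $\sin z$ is known; the helper `line_integral`, the two paths and the step $\zeta$ are our own illustrative choices.

```python
import cmath


def line_integral(f, a, b, n=4000):
    """Midpoint-rule approximation to the integral of f along the straight line a -> b."""
    step = (b - a) / n
    return sum(f(a + (k + 0.5) * step) for k in range(n)) * step


f = cmath.cos
z0, Z = 0j, 1 + 1j

# Two different paths from z0 to Z: the straight line, and a detour through z = 2.
direct = line_integral(f, z0, Z)
detour = line_integral(f, z0, 2 + 0j) + line_integral(f, 2 + 0j, Z)

# Difference quotient of F at Z, using a short straight step zeta.
zeta = 1e-3 * (1 + 2j)
quotient = line_integral(f, Z, Z + zeta) / zeta
```

Both paths give $\sin Z$, since $\sin 0 = 0$, and the difference quotient approaches $\cos Z$ as $\zeta$ is made smaller.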

11·06. Power series. The fundamental rules, if applied a finite number of times,
will define a rational function of $z$. But other functions can be obtained by considering
sums of infinite series, the most important of which are those in positive integral powers
of $z$. Consider then the series

$$f(z) = a_0 + a_1 z + a_2 z^2 + \dots + a_n z^n + \dots, \tag{1}$$

where the $a_n$ may be real or complex. Consider also the companion series

$$g(r) = b_0 + b_1 r + b_2 r^2 + \dots + b_n r^n + \dots, \tag{2}$$

where

$$b_n = |a_n|, \qquad r = |z|. \tag{3}$$

According to the value of $r$, the terms $b_n r^n$ may be bounded or not. That is, we have

Case 1. There is an $M$ such that $b_n r^n < M$ for all $n$.

Case 2. For any $M$ there is an $n$ such that $b_n r^n > M$.

The geometric series

$$1 + z + z^2 + \dots \tag{4}$$

comes under Case 1 for $r \le 1$, and under Case 2 for $r > 1$.

The exponential series

$$1 + z + \frac{z^2}{2!} + \dots + \frac{z^n}{n!} + \dots \tag{5}$$

comes under Case 1 for all $r$. For if we take $m > 2r$ and $n > m$,

$$\frac{r^n}{n!} = \frac{r^m}{m!}\cdot\frac{r^{n-m}}{(m+1)\cdots n} < \frac{r^m}{m!}\left(\frac12\right)^{n-m}. \tag{6}$$

Then if $M$ is the largest of $1, r, r^2/2!, \dots, r^m/m!$, it follows that $r^n/n! \le M$ for all $n$.

The series

$$1 + z + 2!\,z^2 + \dots + n!\,z^n + \dots$$

comes under Case 2 for all $r > 0$; for if we take $m > 2/r$ and $n > m$

$$n!\,r^n = m!\,r^m\cdot(m+1)\cdots n\,r^{n-m} > m!\,r^m\cdot2^{n-m},$$

which can be made to exceed any $M$ by taking $n$ large enough, and all later terms are
larger still.
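
The distinction between the two cases can be seen numerically by examining $\log(b_n r^n)$, which avoids overflow for large $n$. The Python sketch below is only an illustration over finitely many terms; the three function names are ours.

```python
import math


def log_term_geometric(n, r):
    """log of the n-th term r^n of the geometric series."""
    return n * math.log(r)


def log_term_exponential(n, r):
    """log of the n-th term r^n / n! of the exponential series."""
    return n * math.log(r) - math.lgamma(n + 1)


def log_term_factorial(n, r):
    """log of the n-th term n! r^n of the series with factorial coefficients."""
    return math.lgamma(n + 1) + n * math.log(r)


# Exponential series at r = 10: the terms rise to about 2756 and then decay.
largest_exponential = max(log_term_exponential(n, 10.0) for n in range(501))
# Factorial series at r = 0.01: the terms eventually grow without limit.
late_factorial = log_term_factorial(1000, 0.01)
```

For the exponential series with $r = 10$ the largest term is $10^{10}/10!$, in agreement with the bound obtained from (6) by taking $m > 2r$.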
Thus there are series that fall within Case 1 for every value of $r$, others that are always
within Case 2 except for $r = 0$, and others again that are within Case 1 for some values
of $r$ and Case 2 for others.

Every term of $g(r)$ increases with $r$ unless it is 0 for all $r$. Hence if the series is in
Case 1 for $r = r_1$ and Case 2 for $r = r_2$, we must have $r_1 < r_2$. Now suppose that values
of $r_1$ and $r_2$ satisfying these conditions are found. The property ‘the terms of $g(r)$ are
bounded for $r < r_0$’ defines a cut in the positive values of $r_0$, say at $r_0 = R$.

Hence if $g(r)$ belongs to Case 1 for some values of $r$ and to Case 2 for others there is always
a unique quantity $R$ associated with the series such that all values of $r < R$ belong to
Case 1 and all greater than $R$ to Case 2. We call this the radius of convergence and the circle
$|z| = R$ the circle of convergence. If $g(r)$ belongs to Case 1 for all $r$ we can write $R = \infty$.

When $r = R$, $g(r)$ may be in either Case 1 or Case 2. Thus for (4) and the series

$$1 + z + \tfrac12 z^2 + \dots + \frac1n z^n + \dots$$

if $r = 1$, $r^n$ and $r^n/n$ are $\le 1$ for all $n$; and $r^n$ and $r^n/n$ are unbounded if $r > 1$. Thus the
radius of convergence is 1 and the terms are bounded on the circle of convergence.

But for the series

$$1 + 2z + 3z^2 + \dots + (n + 1)z^n + \dots$$

the radius of convergence is again 1, but the terms are unbounded on the circle of
convergence.

11·061. Absolute convergence. Suppose now that $c$ is any positive quantity less
than $R$. We know that for $r = c$ the terms of $g(r)$ are bounded, that is, they do not increase
indefinitely; hence there is a quantity $M$ such that

$$b_n c^n < M$$

for all $n$. Hence for any $r < c$

$$b_n r^n < M(r/c)^n.$$

Further, for any $m$ and $p$ ($m < p$),

$$\sum_{n=m+1}^{p} b_n r^n < M\left(\frac rc\right)^{m+1}\left\{1 + \frac rc + \dots + \left(\frac rc\right)^{p-m-1}\right\} = M\left(\frac rc\right)^{m+1}\frac{1 - (r/c)^{p-m}}{1 - r/c} < M\left(\frac rc\right)^{m+1}\frac{1}{1 - r/c}.$$

Hence if we choose an $\epsilon$, however small, we can choose $m_0$ so that the sum of terms after
the $m$th for $m > m_0$ will never exceed $\epsilon$ however many we take. The series $\sum b_n r^n$ therefore
converges. Further

$$\left|\sum_{n=m+1}^{p} a_n z^n\right| \le \sum_{n=m+1}^{p}|a_n z^n| = \sum_{n=m+1}^{p} b_n r^n,$$

and therefore the series $\sum a_n z^n$ converges for $|z| = r < c < R$, and therefore for $|z| < R$.

Within the circle of convergence the sum of the moduli of the terms is a convergent
series. Such convergence is called absolute by analogy with the corresponding property
for real series.*
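
The geometric bound on the remainder can be verified for a particular series. The Python sketch below uses $b_n = n + 1$, for which $R = 1$, with $c = 0.9$ and $r = 0.5$; the value $M = 4$ is our own choice, made because the largest of the $b_n c^n$ is about $3.87$. The names `tail_sum` and `tail_bound` are illustrative.

```python
def tail_sum(b, r, m, p):
    """Sum of b(n) r^n for n = m+1, ..., p."""
    return sum(b(n) * r ** n for n in range(m + 1, p + 1))


def tail_bound(M, r, c, m):
    """The bound M (r/c)^(m+1) / (1 - r/c) on every such sum."""
    return M * (r / c) ** (m + 1) / (1 - r / c)


def b(n):
    return n + 1


c, r, M = 0.9, 0.5, 4.0
largest_term_at_c = max(b(n) * c ** n for n in range(500))  # must stay below M
checks = [(tail_sum(b, r, m, m + 200), tail_bound(M, r, c, m)) for m in range(30)]
```

Every partial remainder lies below its bound, and the bound itself tends to zero as $m$ increases, which is the substance of the convergence proof.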

* It is usual to take the property “the series is convergent for any $z$ such that $|z| < r_0$” as
defining $R$ by a cut in the values of $r_0$; but it appears to us slightly more obvious that boundedness
of the terms defines a cut, and this boundedness is used directly in proving many later theorems.

It is obvious that the series never converges if $|z| > R$.

Since $(1 - r/c)^{-1}$ is bounded in any circle $|z| < d < c$, we have always

$$f(z) - \sum_{n=0}^{m} a_n z^n = O(z^{m+1}).$$
11·062. Uniform convergence. The definitions of uniform convergence for sequences
and series of analytic functions are immediate extensions of those for functions of a real
variable. If for all $z$ in a region $f_n(z) \to f(z)$, and

$$f(z) = f_n(z) + R_n(z),$$

and if for any positive $\epsilon$ we can choose $n$ so that for every $z$ in the region $|R_n(z)| < \epsilon$,
the sequence $\{f_n(z)\}$ is said to be uniformly convergent to $f(z)$ in the region.* If

$$f_n(z) = \phi_n + i\psi_n \to \phi + i\psi,$$

it follows that with the same choice of $n$

$$|\phi_n - \phi| < \epsilon, \qquad |\psi_n - \psi| < \epsilon,$$

and therefore $\{\phi_n\}$, $\{\psi_n\}$ are uniformly convergent to $\phi$, $\psi$ respectively. Conversely if
$\{\phi_n\}$, $\{\psi_n\}$ are uniformly convergent to $\phi$, $\psi$, $\{\phi_n + i\psi_n\}$ is uniformly convergent to $\phi + i\psi$.

If we choose $d < c < R$ in 11·061, and $m$ so that $\sum_{n=m+1}^{\infty} b_n d^n < \epsilon$, then for any $z$ such
that $|z| = r \le d$

$$\sum_{n=m+1}^{\infty}|a_n z^n| < \epsilon, \qquad \left|\sum_{n=m+1}^{\infty} a_n z^n\right| \le \sum_{n=m+1}^{\infty} b_n r^n < \epsilon.$$

That is, we can choose $m$ once for all, given $\epsilon$ and $d$, and it will do for all values of
$r$ within a range up to and including $d$. We thus arrive at a case of the M test for uniform
convergence, extended to the complex variable. Formally we may state it as follows:
If for all values of $z$ in a region $|u_n(z)|$ is less than $v_n$, which is independent of $z$, and the series
$\sum v_n$ is convergent, then the series $\sum u_n(z)$ is uniformly convergent in the region. This test is
sufficient for uniform convergence, but not necessary. We see that any series satisfying it is
also absolutely convergent in the region, and a series can be uniformly convergent without
being absolutely convergent. We thus have the theorem: A power series in $z$ is uniformly
convergent within and on any circle with centre $z = 0$ and lying wholly within the circle of
convergence. It may not converge uniformly, or even converge at all, on the circle of
convergence itself.

It follows that if a power series has a radius of convergence $R$ different from 0, then
for any $z$ such that $|z| \le c < R$ the sum of the series has a definite value; it therefore defines
a pair of functions $\phi(x, y)$ and $\psi(x, y)$. Each of them is the sum of two uniformly convergent
real series. If $g(r)$ belongs to Case 1 for all values of $r$, we may take any finite value for
$c$ in 11·061, and the argument proceeds as before. In that case $\sum a_n z^n$ defines such a pair
of functions over the whole plane.

Some examples will now be given to illustrate the possible modes of behaviour of power
series on the circle of convergence. We have considered the behaviour of the separate
terms; we have now to consider the sums.

* We are not assuming at present that $f(z)$ is analytic; but cf. 11·20.
11·063. Types of power series: integral functions. Consider again the series

$$f(z) = 1 + z + z^2 + \dots. \tag{1}$$

It does not converge for any value on the circle $|z| = 1$, since all the terms have modulus 1
and the sum of $n$ terms tends to no limit. This series is easily summed for $|z| < 1$; as for
the real variable,

$$f(z) = \frac{1}{1 - z}. \tag{2}$$

We notice that though $f(z)$ as defined by (1) is meaningless for $|z| \ge 1$, the expression (2)
has a definite value for any $z$ except $z = 1$.

If we take the series

$$f(z) = z + \tfrac12 z^2 + \dots + \frac{z^n}{n} + \dots \tag{3}$$

we find a different behaviour; as for the last, it converges for all $|z| < 1$ and diverges for
all $|z| > 1$, but it also converges for all $|z| = 1$ except for $z = 1$ itself.* This series does not
represent a rational function; it can be taken as the definition of $-\log(1 - z)$.

The series

$$f(z) = z + \frac{z^2}{2^2} + \dots + \frac{z^n}{n^2} + \dots \tag{4}$$

has the same circle of convergence but converges even at $z = 1$. Thus we have three series
with the same circle of convergence but behaving radically differently on the circle itself.

The series

$$\exp z = 1 + z + \frac{z^2}{2!} + \dots + \frac{z^n}{n!} + \dots, \tag{5}$$

on the other hand, converges for any $z$.

Functions definable by the same power series over the whole plane are called integral
functions. Apart from terminating series, the exponential series is the most familiar
example; closely related to it are the functions $\cosh z$, $\sinh z$, $\cos z$, $\sin z$.

The series

$$1 + 1!\,z + 2!\,z^2 + \dots + n!\,z^n + \dots \tag{6}$$

is not convergent for any $z$ other than 0, however small. Such a series defines no function
except for $z = 0$, and we may say that its radius of convergence is zero.
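
The different behaviour of the series (3) and (4) on the circle $|z| = 1$ can be observed in their partial sums. The Python sketch below is illustrative only: a finite number of terms can suggest convergence or divergence but cannot establish it, the names are ours, and the value $\pi^2/6$ for the sum of $1/n^2$ is a standard result quoted as a check.

```python
import cmath
import math


def partial_sum(coeff, z, n_max):
    """Sum of coeff(n) z^n for n = 0, ..., n_max, forming the powers of z successively."""
    total, power = 0j, 1 + 0j
    for n in range(n_max + 1):
        total += coeff(n) * power
        power *= z
    return total


def log_coeff(n):
    return 0.0 if n == 0 else 1.0 / n           # series (3)


def square_coeff(n):
    return 0.0 if n == 0 else 1.0 / n ** 2      # series (4)


N = 20000
at_i = partial_sum(log_coeff, 1j, N)            # point of the circle other than z = 1
at_one_log = partial_sum(log_coeff, 1.0, N)     # harmonic series, grows like log N
at_one_square = partial_sum(square_coeff, 1.0, N)
```

At $z = i$ the partial sums of (3) settle down to $-\log(1 - i)$, while at $z = 1$ they continue to grow; those of (4) at $z = 1$ approach a finite limit.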

* Cf. 1·1155.

11·07. Differentiation of power series. We have still to show that a function
defined by a power series is analytic. If $f(z)$ is defined by $\sum a_n z^n$ we may call the series

$$a_1 + 2a_2 z + 3a_3 z^2 + \dots + na_n z^{n-1} + \dots$$

obtained by differentiating term by term, the first derived series. We can show easily
that if $R$ is the radius of the circle of convergence of the series defining $f(z)$, it is also that
for the derived series. We can construct derived series of higher orders similarly.

A power series can always be integrated term by term within the circle of convergence,
because it is uniformly convergent (1·113). Hence if

$$f(z) = a_0 + a_1 z + \dots + a_n z^n + \dots,$$

we have

$$\int_{z_0}^{Z} f(z)\,dz = F(Z) - F(z_0),$$

where

$$F(z) = a_0 z + \tfrac12 a_1 z^2 + \dots + \frac{1}{n + 1}a_n z^{n+1} + \dots.$$

Similarly, we may start with the derived series and infer, since this has the same circle
of convergence,

$$\int_{z_0}^{Z}\{a_1 + 2a_2 z + \dots + na_n z^{n-1} + \dots\}\,dz = \bigl[a_0 + a_1 z + \dots + a_n z^n + \dots\bigr]_{z_0}^{Z} = f(Z) - f(z_0).$$

Differentiating with regard to $Z$, we have

$$a_1 + 2a_2 Z + \dots + na_n Z^{n-1} + \dots = df(Z)/dZ,$$

and therefore the derived series is the derivative of the original series anywhere within
the circle of convergence. This shows further that a function defined by a power series
has a derivative and therefore is an analytic function within the circle of convergence.
The second derived series similarly represents the second derivative of $f(z)$, and since it
converges the first derivative is continuous. Thus functions defined by power series
satisfy the conditions used in 11·053.
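
Term-by-term differentiation and integration act on the list of coefficients alone, and the results can be compared with known closed forms. The Python sketch below uses the geometric series, truncated at 200 terms, at a point with $|z| = 0.5$; the helper names `derived`, `integrated` and `evaluate` are ours.

```python
import cmath


def derived(coeffs):
    """Coefficients of the first derived series: n a_n multiplies z^(n-1)."""
    return [n * a for n, a in enumerate(coeffs)][1:]


def integrated(coeffs):
    """Coefficients of the term-by-term integral: a_n / (n+1) multiplies z^(n+1)."""
    return [0.0] + [a / (n + 1) for n, a in enumerate(coeffs)]


def evaluate(coeffs, z):
    """Sum of coeffs[n] z^n by nested multiplication."""
    total = 0j
    for a in reversed(coeffs):
        total = total * z + a
    return total


geometric = [1.0] * 200          # truncation of 1 + z + z^2 + ...
z = 0.3 + 0.4j                   # |z| = 0.5, well inside the circle of convergence

derivative_value = evaluate(derived(geometric), z)
integral_value = evaluate(integrated(geometric), z)
```

The derived series reproduces $1/(1 - z)^2$ and the integrated series reproduces $-\log(1 - z)$, as the theory requires.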


11·08. Multiplication of power series. It can also be proved that if two series

$$f(z) = a_0 + a_1 z + a_2 z^2 + \dots, \qquad g(z) = b_0 + b_1 z + b_2 z^2 + \dots,$$

both converge within any circle, then the product series

$$h(z) = c_0 + c_1 z + c_2 z^2 + \dots,$$

where

$$c_0 = a_0 b_0, \qquad c_1 = a_0 b_1 + a_1 b_0, \qquad c_2 = a_0 b_2 + a_1 b_1 + a_2 b_0, \quad\dots,$$

obtained by multiplying terms in pairs and collecting coefficients of the same power of $z$,
converges within the same circle and is equal to $f(z)g(z)$. The proof is similar to that for
absolutely convergent series of real terms.

It can also be proved that if a series is absolutely convergent it will give the same sum
when the terms are taken in any order.

An immediate application of a similar argument gives

$$\exp z\,\exp z' = \exp(z + z')$$

for all $z$, $z'$, as for the case of two real variables. Since if we write $e^z$ for $\exp z$, $z$ obeys the
usual rules of indices, we can take this as a definition of $e^z$ when $z$ is complex. It should
be noticed that $e^z$, from this point of view, is not to be regarded as the result of a process
of raising $e$ to the power $z$. Thus if we take $z = \tfrac12$, $\exp\tfrac12$ is a unique number defined as the
sum of the series

$$1 + \frac12 + \frac{1}{2!\,2^2} + \frac{1}{3!\,2^3} + \dots$$

but the result of taking the square root of $e$ might be either $\pm\exp\tfrac12$. We take $e^z$ to mean
the same as $\exp z$, but other conventions are in use.
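
The rule for the coefficients $c_n$ is easily programmed. As a sketch (the function name `cauchy_product` and the truncation at 30 terms are our own choices), the series for $\exp z$ and $\exp 2z$ are multiplied below; by the addition theorem the product should be the series for $\exp 3z$, whose coefficients are $3^n/n!$.

```python
import math


def cauchy_product(a, b):
    """c_k = a_0 b_k + a_1 b_(k-1) + ... + a_k b_0, for as many k as both lists allow."""
    n = min(len(a), len(b))
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(n)]


N = 30
exp_z = [1 / math.factorial(n) for n in range(N)]          # coefficients of exp z
exp_2z = [2 ** n / math.factorial(n) for n in range(N)]    # coefficients of exp 2z

product = cauchy_product(exp_z, exp_2z)
value_at_half = sum(c * 0.5 ** k for k, c in enumerate(product))   # h(1/2)
```

Each $c_k$ is a finite sum, so the first $N$ coefficients of the product are obtained exactly, apart from rounding, from the first $N$ coefficients of the factors.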


11·09. Limit-points. The definition of a limit-point in more than one dimension has
been given in 5·01. We recall that any neighbourhood of a limit-point of a set contains
an infinite number of members of the set and that any bounded infinite set has at least
one limit-point (5·02).


11·091. A power series cannot have $z = 0$ as a limit-point of zeros, unless it vanishes
for all $z$. For suppose that the series

$$f(z) = a_0 + a_1 z + a_2 z^2 + \dots$$

has a zero sum for some non-zero $z$ within any circle about 0. If possible, suppose that
there is at least one term with a non-zero coefficient. Let the first term with a non-zero
coefficient be $a_m z^m$. Then $g(z) = z^{-m}f(z) = \sum_{n=m}^{\infty} a_n z^{n-m}$ is a power series with the same circle
of convergence as $f(z)$. Therefore it is uniformly convergent and therefore continuous in
a neighbourhood of $z = 0$. But it is zero for some point in any neighbourhood of 0; hence
$g(0) = 0$ and $a_m = 0$. Hence $f(z) = 0$ for all $z$.

In particular, if $f(z) = 0$ for all points within a circle about 0, however small, or even
for all points on the real axis within some finite range of $x$, $f(z)$ is 0 for all $z$. But these
conditions, which are those usually given in practice, are more than sufficient for the
truth of the theorem: it would be enough, for instance, if $f(1/n) = 0$ for every integral $n$
greater than 1000.

The most important application is that a function can have only one expansion in
powers of $z$. For if, for $|z| < R$,

$$f(z) = \sum a_n z^n = \sum a_n' z^n,$$

we have

$$\sum(a_n - a_n')z^n = 0$$

for all $|z| < R$. Hence $a_n = a_n'$. This justifies the method of equating coefficients of
powers of $z$.


11·10. Taylor’s theorem. Let

$$f(z) = a_0 + a_1 z + a_2 z^2 + \dots \tag{1}$$

with radius of convergence $R$. Let as before $c$ be less than $R$ and let all $|a_n c^n|$ be less
than $M$. Then

$$b_n = |a_n| < M/c^n. \tag{2}$$

Take $z_0$ such that $|z_0| < c$ and put $z = z_0 + z'$. Substitute in the series, and expand each
term by the binomial theorem; we have

$$f(z_0 + z') = a_0 + a_1(z_0 + z') + a_2(z_0^2 + 2z_0 z' + z'^2) + \dots + a_n\sum_{m=0}^{n}\frac{n!}{m!\,(n - m)!}z_0^{n-m}z'^m + \dots. \tag{3}$$

Consider the companion series obtained by taking the modulus of every term; writing
as before $|z| = r$, $|z_0| = r_0$, $|z'| = r'$, we have

$$\sum_{n=0}^{\infty}\sum_{m=0}^{n} b_n\frac{n!}{m!\,(n - m)!}r_0^{n-m}r'^m < M\sum_{n=0}^{\infty}\sum_{m=0}^{n}\frac{n!}{m!\,(n - m)!}\frac{r_0^{n-m}r'^m}{c^n} = M\sum_{n=0}^{\infty}\frac{(r_0 + r')^n}{c^n}, \tag{4}$$

which converges if $r_0 + r' < c$,
that is, within any circle with centre $z_0$ that does not pass beyond the circle of radius $c$.
Hence the series (3) is absolutely convergent within such a circle. Its terms can therefore
be taken in any order and will always give a convergent series with the same sum. Take
them in order of ascending powers of $z'$. The terms independent of $z'$ are

$$a_0 + a_1 z_0 + a_2 z_0^2 + \dots = f(z_0), \tag{5}$$

the coefficient of $z'$ is

$$a_1 + 2a_2 z_0 + 3a_3 z_0^2 + \dots = f'(z_0), \tag{6}$$

and in general the coefficient of $z'^m$ is

$$a_m + (m + 1)a_{m+1}z_0 + \dots + \frac{n!}{m!\,(n - m)!}a_n z_0^{n-m} + \dots, \tag{7}$$

where $n \ge m$; on putting $n - m = n'$, the general term is

$$\frac{(n' + m)!}{m!\,n'!}a_{n'+m}z_0^{n'}, \tag{8}$$

where $n' \ge 0$. Then the coefficient of $z'^m$ can be written

$$\frac{1}{m!}\{m(m - 1)\cdots1\cdot a_m + (m + 1)m\cdots2\,a_{m+1}z_0 + \dots + (n + m)(n + m - 1)\cdots(n + 1)a_{m+n}z_0^n + \dots\} = \frac{1}{m!}f^{(m)}(z_0), \tag{9}$$

the bracketed index indicating the $m$th derivative. Hence

$$f(z) = f(z_0) + z'f'(z_0) + \frac{z'^2}{2!}f''(z_0) + \dots + \frac{z'^m}{m!}f^{(m)}(z_0) + \dots \tag{10}$$

within any circle about $z_0$ that does not reach the circle of convergence of the series (1).

This is the form taken by Taylor’s theorem when the variable is complex. The infinite 
series always converges, and there is no need for a remainder term. 
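
The rearrangement that leads to the Taylor series about $z_0$ can be checked numerically. The sketch below is an illustration only and is not part of the original argument: it assumes the geometric series $a_n = 1$, so that $f(z) = 1/(1 - z)$, and the name `recentred_coefficients`, the point `z0` and the truncation at 400 terms are hypothetical choices.

```python
from math import comb

def recentred_coefficients(a, z0, m_max):
    """Coefficient of z'^m found by collecting the binomial terms of each a_n (z0 + z')^n.

    a is a truncated list of the a_n; the result approximates f^(m)(z0) / m!.
    """
    return [
        sum(comb(n, m) * a[n] * z0 ** (n - m) for n in range(m, len(a)))
        for m in range(m_max + 1)
    ]

# Assumed example: a_n = 1, so f(z) = 1/(1 - z) and f^(m)(z0)/m! = (1 - z0)^-(m+1).
a = [1.0] * 400
z0 = 0.3 + 0.2j
coeffs = recentred_coefficients(a, z0, 5)
exact = [(1 - z0) ** -(m + 1) for m in range(6)]
max_error = max(abs(c - e) for c, e in zip(coeffs, exact))
print(max_error)
```

The agreement is limited only by the truncation of the series and by rounding, since $|z_0|$ is well inside the circle of convergence.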

11*11. Singularities. We have so far restricted ourselves to functions that are 
uniquely defined at each point of a region and differentiable and therefore continuous 
for all variations of z within the region. We have thus excluded any function if at any 
point of the region it is capable of taking two or more values or if it is non-differentiable; 
in particular, if it tends to infinity as z approaches some point in the region. We proceed 
now to consider what can happen in the latter cases. 

A singularity $a$ of a function $f(z)$ is any value of $z$ such that we cannot choose a
positive $\delta$ so that $f(z)$ is analytic and single-valued for $|z - a| < \delta$. It follows from the
definition that a limit-point of singularities is a singularity.

(a) Branch points. Consider the function $z^{1/2}$. This is finite for all finite $z$, but even for
$z$ real and positive there is an ambiguity about which sign shall be taken. Suppose that
we agree to take the positive root in that case. Now let us proceed in a circle about the
origin in the direction of increasing argument, and let $z^{1/2}$ vary continuously. Then if
we put

$$z = r e^{i\theta},$$

we can take

$$z^{1/2} = r^{1/2} e^{\frac{1}{2} i\theta}.$$

We are not varying $r$ and therefore need not vary $r^{1/2}$, and $\theta$ must vary continuously. But
when $\theta$ has reached $2\pi$, $\frac{1}{2}\theta$ has reached $\pi$, and $e^{\frac{1}{2} i\theta}$ is $-1$. Increasing $\theta$ further we repeat
all the previous values with the opposite sign. Thus we cannot attach a single value to $z^{1/2}$
at every point if we allow $\theta$ to vary by more than $2\pi$ with $r$ constant. But if we make it
a rule that $\theta$ is never to reach $2\pi$, we can make $z^{1/2}$ single-valued. For then, though we may
make $\theta$ as near $2\pi$ as we like, the only way of getting back to the positive real axis is to
make the circuit of the origin in the opposite direction, and in doing so we undo the previous
variation of $\theta$ and arrive back at the original value of $z^{1/2}$. To use this device it will be
necessary to exclude the possibility of a complete circuit for every value of $r$, and we speak
of a cut along the positive real axis. This amounts to defining

$$z^{1/2} = r^{1/2} \exp \tfrac{1}{2} i\theta \quad (0 \le \theta < 2\pi),$$

where we always take the positive value of $r^{1/2}$. We could equally well take the range of
$\theta$ to be $-\pi < \theta \le \pi$, or $-\pi \le \theta < \pi$. In general if $n$ is fractional we can define

$$z^n = r^n \exp n i\theta$$

with the same restrictions on $\theta$.

A function may be single-valued and even differentiable at $z = 0$ without being analytic
there; $z^{1/2} = 0$ at $z = 0$, and $z^{3/2} = 0$ and has a zero derivative there, no matter what sign we
take first. The definition of a branch point concerns its neighbourhood; a point $z = a$ is a
branch point of $f(z)$ if when $z$ moves around $a$ in an arbitrarily small circle, not of zero radius,
the value of $f(z)$ being chosen at each value of $z$ to preserve continuity, $f(z)$ does not return to
its original value.

If

$$f(z) = (z^2 - a^2)^{1/2},$$

where $a$ is real, there are branch points at $\pm a$. If $z$ makes a complete circuit about any
curve including both of them, $f(z)$ returns to its original value; for the square roots of
$z - a$ and $z + a$ both change sign and their product has its original sign. But $f(z)$ would be
reversed if we went round any circuit that included either of $\pm a$ and excluded the other.
Here we can make $f(z)$ single-valued by making it a rule that we take the positive sign
on the real axis when $x > a$, and never cross the real axis between $-a$ and $+a$.

When we make a cut, we select one value of the function for every point in the
region, and have a single-valued function in the region. But it is discontinuous when $z$
crosses the cut, which is therefore a line of singularities. Thus if $z^{1/2}$ is defined by
$r^{1/2} \exp \frac{1}{2} i\theta$ $(-\pi < \theta \le \pi)$, where we take the positive sign for $z$ real and positive, $z^{1/2}$ has
a discontinuity $2i r^{1/2}$ when $z$ crosses the negative real axis. If we took $z^{1/2}$ as meaning
$-r^{1/2} \exp \frac{1}{2} i\theta$, we should get a different single-valued function, which can be called a
different branch of $z^{1/2}$. In what follows we shall assume that all functions are single-valued
or have been made so by means of a cut or cuts.
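
The discontinuity across the cut can be seen numerically. The following is a minimal sketch, not taken from the text: the helper `sqrt_branch` is a hypothetical name, and it assumes the branch with $-\pi < \theta \le \pi$, which is the range returned by Python's `cmath.phase`.

```python
import cmath

def sqrt_branch(z):
    """r^(1/2) * exp(i*theta/2), with theta = cmath.phase(z) lying in (-pi, pi]."""
    r, theta = abs(z), cmath.phase(z)
    return r ** 0.5 * cmath.exp(0.5j * theta)

r, eps = 4.0, 1e-12
above = sqrt_branch(complex(-r, eps))    # just above the negative real axis
below = sqrt_branch(complex(-r, -eps))   # just below it
jump = above - below
print(jump)  # close to 2i * r^(1/2) = 4i
```

Taking `-sqrt_branch(z)` instead gives the other branch, with a jump of the opposite sign across the same cut.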

If $f(z)$ is not single-valued on a contour $C$, that is, if it does not return to its original
value when $z$ describes the contour, $f(z)$ varying continuously, $f(z)$ has a singularity within or
on the contour. For if $f(z)$ is analytic within and on the contour we can superpose a net of
squares such that $f(z)$ varies continuously when $z$ describes any interior square or any
fringing portion as in 11·052 and returns to its original value. It follows by addition
that the change of $f(z)$ when $z$ describes $C$ is zero.


(b) Poles. A function $f(z)$ may be unbounded in any circle about $a$, however small,
but be such that when we make a circuit of the point the function returns to its original
value. A pole of $f(z)$ of order $m$ is a point $a$ such that there is a positive integer $m$ such
that for $z \ne a$,

$$f(z) = \frac{A_m}{(z-a)^m} + \frac{A_{m-1}}{(z-a)^{m-1}} + \dots + \frac{A_1}{z-a} + g(z),$$

in a region enclosing $a$, where $A_m \ne 0$, and $g(z)$ is analytic at $a$. The terms containing
negative powers of $z - a$ are called the principal part of the function near $z = a$. Poles of
order 1 are also called simple poles. We do not speak of poles of non-integral order, since
$a$ would be a branch point of $(z-a)^{-n}$ if $n$ was not an integer.

If $a$ is a pole of order $m$, $(z-a)^m f(z)$ is analytic in a neighbourhood of $a$, and so is
$1/\{f(z) - c\}$, where $c$ is any constant.

(c) Essential singularities. An essential singularity is any point $a$, not a branch point
or a pole, such that $f(z)$ is not analytic in a neighbourhood of $a$. $\exp(1/z)$ has an essential
singularity at $z = 0$. If $z$ is real and positive, then whatever $m$ we choose, $z^m \exp(1/z)$
tends to infinity as $z$ tends to 0. The behaviour of functions near essential singularities is
more peculiar than near poles. If $f(z)$ has a pole at $a$, then for all methods of approach to $a$,
$|f(z)|$ tends to infinity. But if $z$ tends to 0 through negative real values $\exp(1/z)$ tends
to 0. We can say, if we like, that $f(z)$ is infinite at a pole, provided that we understand that
we mean by this nothing more than that, for every sequence of points $z_n$ tending to the
pole, $|f(z_n)| \to \infty$. We cannot say that it is infinite at an essential singularity because it
may be possible by choosing the method of approach suitably to make the limit of $f(z)$
finite. Thus if we take the equation

$$\exp(1/z) = b,$$

where $b$ is not 0 or $\infty$, it is satisfied wherever

$$z = \frac{1}{2n\pi i + \log b},$$

$n$ being an integer, and by taking $n$ larger and larger we can make $z$ as near as we like to 0
while keeping $\exp(1/z)$ always equal to $b$. It will be seen that when a many-valued
function has been replaced by a single-valued one by means of a cut, every point of
the cut is an essential singularity.
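
The solutions of $\exp(1/z) = b$ that crowd towards the origin can be tabulated directly. The sketch below is illustrative only: the value of `b`, the chosen integers `n` and the name `preimages` are assumptions made for the example.

```python
import cmath

def preimages(b, n_values):
    """Points z_n = 1/(2*pi*i*n + log b); at each of them exp(1/z_n) = b."""
    return [1 / (2j * cmath.pi * n + cmath.log(b)) for n in n_values]

b = 2.0 - 3.0j                       # assumed value, neither 0 nor infinity
zs = preimages(b, [1, 10, 100, 1000])
moduli = [abs(z) for z in zs]        # shrinks towards 0 as n grows
residuals = [abs(cmath.exp(1 / z) - b) for z in zs]
print(moduli)
print(residuals)
```

The moduli decrease roughly like $1/(2\pi n)$ while the residuals stay at rounding level, so $\exp(1/z)$ takes the value $b$ in every neighbourhood of 0.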

(d) A formally possible type of singularity, as for the real variable, is a removable
discontinuity. If $f(z)$ is analytic in any ring $0 < \delta < |z - a| < c$, where $\delta$ is arbitrarily
small, and if $f(z) \to d$ when $z \to a$ in any manner, but $f(a) \ne d$, we call $a$ a removable
discontinuity. Such singularities have no practical importance but are mentioned for
completeness. We shall always suppose that if $f(z)$ tends to a unique limit $d$ when $z \to a$
in any manner, then $f(a) = d$.

11·111. If $f(z) = \sum a_n z^n$ for all $z$ such that $|z| < R$, there is no singularity of $f(z)$ for
$|z| < R$. For $\sum a_n z^n$ is single-valued and has a derivative everywhere within the circle
of convergence.

If $f(z)$ has a line of discontinuity, the circle of convergence may overlap this line.
Then $f(z)$ has no singularity within the part of the circle that includes $z = 0$. In the
other part the series will converge but not be equal to $f(z)$.

11·112. Singularities at infinity. If $z = 1/\zeta$ and

$$f(z) = g(\zeta),$$

$g(\zeta)$ may either be analytic at $\zeta = 0$ or have a branch point, a pole of order $m$, or an essential
singularity there. In these cases we say respectively that $f(z)$ is analytic at $z = \infty$,
or has a branch point, a pole of order $m$, or an essential singularity at $z = \infty$. This extension
of the definitions saves some writing.





11·113. Integrals around poles. The equation

$$\frac{d}{dz} z^m = m z^{m-1} \tag{1}$$

is extended to cases where $m$ is a negative integer by using the equation

$$\frac{d}{dz}(uv) = u \frac{dv}{dz} + v \frac{du}{dz} \tag{2}$$

with $u = z^m$, $v = z^{-m}$. The left side vanishes, and with $m$ a negative integer

$$\frac{dv}{dz} = -m z^{-m-1}, \tag{3}$$

whence the result follows. Hence we have (1) for all integral $m$, positive or negative. Then
by using (1) we have for any integral $m$ other than 0

$$\int_{z_0}^{Z} z^{m-1}\,dz = \frac{1}{m} (Z^m - z_0^m).$$

The argument fails for $m = 0$, since the derivative of $z^0$ is not a multiple of $z^{-1}$ but zero.
We define provisionally

$$\log z = \int_1^z \frac{dt}{t}.$$

The integrand is analytic within any region that does not include the origin, and by
Cauchy's theorem the integral has the same value for any two paths such that it is possible
to deform one into the other without passing through the origin. Now put $|z| = r$, $\arg z = \theta$
and take the path to be from $t = 1$ along the real axis to $t = r$, and then along a circle
about the origin to $z$. On the first part of the path $t$ is real and positive, and

$$\int_1^r \frac{dt}{t} = \log r.$$

On the second part we put

$$t = r(\cos\lambda + i \sin\lambda),$$

and then

$$dt = r(-\sin\lambda + i \cos\lambda)\,d\lambda;$$

then

$$\int_r^z \frac{dt}{t} = \int_0^{\theta} i\,d\lambda = i\theta.$$

Hence

$$\log z = \log r + i\theta. \tag{6}$$

In particular, if we make a complete circuit about the origin

$$\oint z^{m-1}\,dz = 0 \tag{7}$$

for $m \ne 0$, since $z^m$ is single-valued; but if $m = 0$

$$\oint \frac{dz}{z} = [\log r + i\theta] = 2\pi i, \tag{8}$$

since $\log r$ is single-valued but $\theta$ increases by $2\pi$.
Log $z$ is not single-valued without some restriction on the path of integration used in
its definition; in other words, it has a branch point at the origin. In fact, if we maintained
continuity and allowed $z$ to make several circuits about the origin, $\log z$ would increase
by $2\pi i$ for each circuit, and thus would have infinitely many possible values differing by
integral multiples of $2\pi i$. Our first example, $z^{1/2}$, had only two. We can, however, make
$\log z$ single-valued by means of any cut used for $z^{1/2}$. The principal value of $\log z$ is that
such that $|\Im(\log z)| < \pi$, when $z$ is not real and negative.

Now apply these rules to integrate the expression in 11·11 (b) around a circuit including
the pole at $z = a$ and no other singularity. All terms in $(z - a)^{-n}$, with $n$ different from 1,
give zero. The integral of $g(z)$ is zero by Cauchy's theorem, since $g(z)$ is analytic in the
region. Hence

$$\int_C f(z)\,dz = \int_C \frac{A_1}{z - a}\,dz = 2\pi i A_1. \tag{9}$$

The integral of an analytic function about a pole therefore depends wholly on the
coefficient of $(z - a)^{-1}$ in its principal part. This coefficient is called the residue of
the function at the pole. The characteristic feature of the method of evaluating definite
integrals known as contour integration is to find a contour containing no singularities other
than poles, the integral around which is equal to the definite integral sought, and then to
equate the integral around this contour to $2\pi i$ times the sum of the residues at poles within
the contour. There is no simple analogous rule for branch points and essential singularities,
though one can be found for the latter in some cases by means of Laurent's theorem,
which will be proved later.
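
The circuit integrals of the powers of $z$, and the residue rule, can be verified by a direct numerical quadrature round a circle. This is a sketch under stated assumptions and not a method from the text: `circle_integral` is a hypothetical helper using the trapezoidal rule in the angle, and the test function with a double pole at `a` is invented for the illustration.

```python
import cmath

def circle_integral(f, centre=0j, radius=1.0, steps=2000):
    """Integral of f round |z - centre| = radius, by the trapezoidal rule in the angle."""
    total = 0j
    for k in range(steps):
        w = cmath.exp(2j * cmath.pi * k / steps)
        # z = centre + radius*w, so dz = i*radius*w*dtheta
        total += f(centre + radius * w) * 1j * radius * w
    return total * 2 * cmath.pi / steps

# z^(m-1) round the origin: zero unless m = 0, when the integral is 2*pi*i.
powers = {m: circle_integral(lambda z, m=m: z ** (m - 1)) for m in (-2, -1, 0, 1, 2)}

# Assumed test function with a double pole at a; the coefficient of 1/(z - a) is 5.
a = 0.5 + 0.25j
f = lambda z: 3 / (z - a) ** 2 + 5 / (z - a) + cmath.exp(z)
residue_integral = circle_integral(f, centre=a, radius=0.5)
print(powers)
print(residue_integral)
```

Only the term in $(z - a)^{-1}$ contributes, so `residue_integral` should agree with $2\pi i \times 5$ to rounding error.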

11·114. Relation of the exponential and logarithmic functions. By direct
multiplication we have

$$\exp(\log z) = \exp(\log r) \times \exp i\theta$$

(since both factors are absolutely convergent series and the same argument applies as for
real numbers)

$$= r(\cos\theta + i \sin\theta) = x + iy = z. \tag{10}$$

We can make use of the exponential and logarithmic functions to define $z^n$ for irrational
and even complex indices; we take

$$z^n = \exp(n \log z). \tag{11}$$

This is single-valued and analytic in any region such that $\log z$ is. The verification that it
has a derivative equal to $n z^{n-1}$ may be left to the student.

If $a$ is real and positive

$$a^z = \exp(z \log a) = \exp(x \log a) \exp(iy \log a),$$

$$|a^z| = \exp(x \log a) = a^x.$$
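
The derivative of $z^n = \exp(n \log z)$ and the modulus of $a^z$ can be checked numerically for a complex index. A minimal sketch follows; the name `power`, the sample values and the step `h` are assumptions, and the principal value of the logarithm (as given by `cmath.log`) is used throughout.

```python
import cmath

def power(z, n):
    """z^n defined as exp(n log z), with the principal value of log z."""
    return cmath.exp(n * cmath.log(z))

# Central difference for d/dz z^n, compared with n z^(n-1), away from the cut.
z, n, h = 1.2 + 0.7j, 0.5 + 2j, 1e-6
numerical = (power(z + h, n) - power(z - h, n)) / (2 * h)
analytic = n * power(z, n - 1)
derivative_error = abs(numerical - analytic)

# |a^z| = a^x for real positive a and z = x + iy.
a, w = 3.0, 0.4 - 1.7j
modulus_error = abs(abs(power(a, w)) - a ** w.real)
print(derivative_error, modulus_error)
```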

11·12. Isolated and non-isolated essential singularities. An essential singularity
$a$ may be isolated or not. If it is isolated we can take a circle about $a$, with a radius not zero,
such that $a$ is the only singularity within it. Thus $\operatorname{cosec}(1/z)$ has a pole whenever $z = 1/n\pi$
and $n$ is an integer, and all these singularities are isolated in the sense that we can draw
a circle about each small enough to contain no other. But they have a limit-point at $z = 0$,
and we see easily that the function is indeterminate and has no derivative at this point,
nor has any function of the form $z^m \operatorname{cosec}(1/z)$. Hence $z = 0$ is an essential singularity,
and it is not isolated since there is another singularity within any circle about it, however
small. On the other hand, $\exp \operatorname{cosec}(1/z)$ has isolated essential singularities at all points
$z = 1/n\pi$ and a non-isolated one at $z = 0$. Functions can be constructed with non-isolated
essential singularities at all points of a curve. The most important cases are those of
single-valued functions that have been derived from many-valued ones by introducing
cuts.

If $f(z)$ is analytic and single-valued in a region except for poles or essential singularities,
and the singularities have a limit-point in the region, the limit-point is a non-isolated
essential singularity. For if the limit-point is $z = 0$, any neighbourhood of $z = 0$ contains
singularities of $f(z)$, and therefore $z = 0$ is a singularity; and it is not a pole because, for
any positive integral $m$, $z^m f(z)$ has singularities in any neighbourhood of $z = 0$.


11·13. Cauchy's integral. Let $f(z)$ be analytic and single-valued within a contour $C$
and continuous in the closed region. Then if $z$ is within $C$,

$$f(z) = \frac{1}{2\pi i} \int_C \frac{f(t)}{t - z}\,dt. \tag{1}$$

If $f(t)$ is expressible as a power series in $t - z$ this is obvious because by Taylor's theorem
the residue of $f(t)/(t - z)$ at $t = z$ is $f(z)$. But we have not yet proved that an analytic
function can be expressed as a power series and this theorem is one step towards proving it.

The only singularity within the contour is the simple pole at $t = z$. If we take a small
circle $C'$ about $z$ there is no singularity between $C$ and $C'$, and the integrals about $C$ and
$C'$ are equal by the corollary to Cauchy's theorem. On $C'$, since $f(t)$ is differentiable at $t = z$,

$$f(t) = f(z) + (t - z)\{f'(z) + v(t)\}, \tag{2}$$

where $v(t) \to 0$ with $t - z$. Then

$$\int_{C'} \frac{f(t)}{t - z}\,dt = f(z) \int_{C'} \frac{dt}{t - z} + f'(z) \int_{C'} dt + \int_{C'} v(t)\,dt. \tag{3}$$

The first term gives $2\pi i f(z)$, the second 0, and the third tends to 0 as the radius of $C'$
tends to 0. But the left side is independent of the radius of $C'$; therefore it is equal to
$2\pi i f(z)$, which proves the proposition.

It follows that

$$f'(z) = \lim_{\zeta \to 0} \frac{1}{\zeta} \frac{1}{2\pi i} \int_C \left( \frac{1}{t - z - \zeta} - \frac{1}{t - z} \right) f(t)\,dt = \lim_{\zeta \to 0} \frac{1}{2\pi i} \int_C \frac{f(t)}{(t - z)(t - z - \zeta)}\,dt = \frac{1}{2\pi i} \int_C \frac{f(t)}{(t - z)^2}\,dt \tag{4}$$

since $|t - z|$ has a positive lower bound on $C$. We can find similarly

$$f^{(n)}(z) = \frac{n!}{2\pi i} \int_C \frac{f(t)}{(t - z)^{n+1}}\,dt, \tag{5}$$

and all these integrals exist. Hence Cauchy's integral can be differentiated under the
integral sign.
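
Cauchy's integral and its derivatives lend themselves to a direct numerical check. The sketch below is an illustration under assumptions, not the text's own procedure: `cauchy_derivative` is a hypothetical helper that takes $C$ to be a circle and uses the trapezoidal rule, and $f = \exp$ is chosen because every derivative is again $\exp z$.

```python
import cmath
from math import factorial

def cauchy_derivative(f, z, n, centre=0j, radius=1.0, steps=4000):
    """n!/(2*pi*i) times the integral of f(t)/(t - z)^(n+1) round |t - centre| = radius."""
    total = 0j
    for k in range(steps):
        w = cmath.exp(2j * cmath.pi * k / steps)
        t = centre + radius * w
        # dt = i*radius*w*dtheta
        total += f(t) / (t - z) ** (n + 1) * 1j * radius * w
    return factorial(n) / (2j * cmath.pi) * total * 2 * cmath.pi / steps

z = 0.3 - 0.2j                      # assumed point inside the unit circle
values = [cauchy_derivative(cmath.exp, z, n) for n in range(4)]
errors = [abs(v - cmath.exp(z)) for v in values]
print(errors)
```

With `n = 0` the function reproduces $f(z)$ itself from its boundary values; higher `n` give the successive derivatives.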
This result is particularly important because in the argument leading up to Cauchy's
theorem we deliberately avoided assuming that $\phi$ and $\psi$ had continuous derivatives; we
assumed only that they were differentiable and satisfied the Cauchy-Riemann relations.
But starting from this assumption we have now shown that the first derivatives in these
conditions are differentiable and therefore are continuous after all. We thus reach the
end of a long story.


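11·14. Relation of Cauchy's integral to power series. Now

$$\frac{1}{t - z} = \frac{1}{t} + \frac{z}{t^2} + \dots + \frac{z^m}{t^{m+1}} + \frac{z^{m+1}}{t^{m+1}(t - z)}. \tag{6}$$

Hence, if $C$ of 11·13 is a circle, $|t| = c$, where $|z| < c$ and $f(t)$ is bounded on $C$,

$$f(z) = \sum_{n=0}^{m} \frac{z^n}{2\pi i} \int_C \frac{f(t)}{t^{n+1}}\,dt + R_m(z), \tag{7}$$

where

$$R_m(z) = \frac{1}{2\pi i} \int_C \frac{z^{m+1} f(t)}{t^{m+1}(t - z)}\,dt. \tag{8}$$

Since $z$ is within $C$, $|t - z|$ has a lower bound $\rho > 0$ when $t$ lies on $C$. Hence

$$|R_m(z)| \le \frac{1}{2\pi} \left| \frac{z}{c} \right|^{m+1} \frac{M}{\rho}\, 2\pi c = \frac{Mc}{\rho} \left| \frac{z}{c} \right|^{m+1}, \tag{9}$$

where $M$ is the upper bound of $|f(t)|$ on $C$. Hence $|R_m(z)| \to 0$, and

$$f(z) = \sum_{n=0}^{\infty} \frac{z^n}{2\pi i} \int_C \frac{f(t)}{t^{n+1}}\,dt \tag{10}$$

for all $z$ within $C$. Hence, in the conditions stated, $f(z)$ has a convergent expansion in a
power series, whose radius of convergence is not less than $c$; and comparing with 11·13 (5) we see
that this series is the Taylor series for $f(z)$ about $z = 0$. Consequently all results proved for
functions defined by power series are true for analytic functions in general.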


11·141. Cauchy's inequality. If we now write

$$f(z) = \sum a_n z^n \tag{11}$$

then for $|z| \le c$

$$|a_n z^n| = \left| \frac{z^n}{2\pi i} \int_C \frac{f(t)}{t^{n+1}}\,dt \right| \le \frac{1}{2\pi}\, c^n\, \frac{M}{c^{n+1}}\, 2\pi c = M. \tag{12}$$

Hence within and on a circle about $z = 0$ no term of the power series expansion of $f(z)$ has a
modulus greater than the maximum modulus of $f(z)$ on $C$. This is Cauchy's inequality.
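
Both the integral formula for the coefficients and Cauchy's inequality can be tested on a simple case. The following sketch is an illustration only: it assumes $f(z) = 1/(2 - z)$, whose coefficients are $a_n = 2^{-(n+1)}$, takes $C$ to be $|t| = 1.5$, and uses the hypothetical helper name `taylor_coefficient`.

```python
import cmath

def taylor_coefficient(f, n, c, steps=4000):
    """a_n = 1/(2*pi*i) times the integral of f(t)/t^(n+1) round |t| = c.

    With t = c*exp(i*theta), dt = i*t*dtheta, so a_n is the mean of f(t)/t^n on C.
    """
    points = (c * cmath.exp(2j * cmath.pi * k / steps) for k in range(steps))
    return sum(f(t) / t ** n for t in points) / steps

def f(z):
    # Assumed example, analytic for |z| < 2.
    return 1 / (2 - z)

c = 1.5
a = [taylor_coefficient(f, n, c) for n in range(12)]
# Largest sampled modulus of f on C (attained at t = c, where |f| = 2).
M = max(abs(f(c * cmath.exp(2j * cmath.pi * k / 4000))) for k in range(4000))
terms_on_circle = [abs(a_n) * c ** n for n, a_n in enumerate(a)]
print(a[:3], M, max(terms_on_circle))
```

Every entry of `terms_on_circle` stays below `M`, as the inequality requires.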

If $R$ is the distance of 0 from the nearest singularity or from the nearest point of the
boundary of the region, whichever is smaller, then by 11·14 the series $\sum a_n z^n$ converges
and is equal to $f(z)$ for all $z$ such that $|z| < R$. Further, $R$ is the greatest value such
that this is true whenever $|z| < R$. For if not, let it be true whenever $|z| < R'$, where
$R' > R$. Then either (1) the circle of convergence of $\sum a_n z^n$ contains a singularity of the
function given by the sum of the series, which is impossible by 11·111, (2) for part of
the circle the sum of the series is not $f(z)$, or (3) the circle of convergence extends beyond
the boundary of the region where $f(z)$ is defined.
If $f(z)$ is defined uniquely for all $z$ except possibly for isolated singularities the theorem
takes the simpler form: the circle of convergence of a power series passes through the
singularity of the function nearest to its centre.

Also, by comparison with 11·091 and 11·14, if two functions are analytic for $|z| < R$,
a necessary and sufficient condition that they shall be equal for $|z| < R$ is that they have
the same expansion in powers of $z$ for $|z| < c$ for some $c < R$; and both functions will then
be equal to the sum of the series for $|z| < R$.

11·142. Liouville's theorem; integral functions for large $|z|$. If $f(z)$ is
analytic over the whole plane, we can take $C$ to be a circle of arbitrarily large radius $c$.
Then if $f(z)$ is bounded over the whole plane, say $|f(z)| < M$,

$$|a_n| \le M/c^n,$$

which is arbitrarily small for $n > 0$. Hence

$$f(z) = a_0,$$

and therefore a function bounded and analytic over the whole plane is a constant. This is
Liouville's theorem.

Conversely if a function is analytic over the whole plane (an integral function) it is
unbounded for large $|z|$ unless it is constant.

11·15. Analytic continuation. We have seen from Taylor's theorem that if a
function $f(z)$ is given by a power series in $z$, it can be represented also as a power series in
$z - z_0$, where $z_0$ is any point within the original circle of convergence, and this series will
converge within any circle about $z_0$ that does not pass beyond the original circle of convergence.
It may, however, converge within a circle that does pass beyond the original
circle of convergence. Take the function

$$f(z) = 1 + z + z^2 + \dots,$$

and put $z_0 = \frac{1}{2} i$. $f(z)$ is already known to be equal to $1/(1 - z)$ for $|z| < 1$, and its derived
series express the functions

$$\frac{1!}{(1-z)^2}, \quad \frac{2!}{(1-z)^3}, \quad \frac{3!}{(1-z)^4}, \quad \dots$$

Hence the Taylor expansion of $f(z)$ in powers of $z' = z - \frac{1}{2} i$ is

$$\frac{1}{1 - \frac{1}{2} i} + \frac{z'}{(1 - \frac{1}{2} i)^2} + \frac{z'^2}{(1 - \frac{1}{2} i)^3} + \dots$$

By Taylor's theorem we know that this series must converge and be equal to the original
function if $|z'| < \frac{1}{2}$, since $i$ is the point of the circle $|z| = 1$ nearest to $\frac{1}{2} i$. But we see by
inspection that it actually converges if $|z'| < |1 - \frac{1}{2} i| = \frac{1}{2}\sqrt{5}$. This is what we should
expect since by 11·141 the circle of convergence must pass through the singularity
of the function nearest to the origin used. But the series considered might have represented
no function already known; in that case the new Taylor series would define values
of an analytic function over a range of $z$ where no function is defined by the original series.






Then we may be able to extend the range of definition further by taking a new Taylor
series about a point in the new region. This process is called analytic continuation, and is
fundamental in Weierstrass's theory of functions, which takes the power series as the
fundamental definition of an analytic function.* Weierstrass's definition has the merit
of being constructive; that is, we can assign any coefficients we like and always get a power
series, the convergence and continuation of which we can proceed to study. Some functions
arise naturally as power series, as, for instance, when we solve a differential equation
by series. Some of the proofs, however, are more difficult with Weierstrass's approach
than with Cauchy's. With the latter, however, we achieve little until we actually find
functions satisfying the conditions stated for a function to be analytic; and what we find
is that the most general analytic functions have power-series expansions. Hence the two
theories are completely equivalent: but in practice our initial information about a function
sometimes shows that it satisfies Weierstrass's condition, sometimes Cauchy's, and
therefore, strictly speaking, both developments are necessary to a complete theory. In
practice, however, when continuation is required the direct use of Taylor series is laborious
and seldom used.
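
The continuation of $1 + z + z^2 + \dots$ by re-expansion about $z_0 = \frac{1}{2} i$ can be followed numerically. The sketch below is only an illustration: the name `continued_sum`, the number of terms and the sample point `z` are assumptions, the point being chosen outside $|z| = 1$ but within $\frac{1}{2}\sqrt{5}$ of $z_0$.

```python
z0 = 0.5j

def continued_sum(z, terms=400):
    """Partial sum of the series in powers of z' = z - z0 whose sum is 1/(1 - z)."""
    zp = z - z0
    return sum(zp ** m / (1 - z0) ** (m + 1) for m in range(terms))

radius = abs(1 - z0)        # distance from z0 to the pole at z = 1
z = -0.9 + 0.9j             # |z| > 1, yet |z - z0| < radius
continuation_error = abs(continued_sum(z) - 1 / (1 - z))
print(radius, abs(z), abs(z - z0), continuation_error)
```

The original series diverges at this `z`, while the re-expanded series still converges to $1/(1 - z)$ there.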

In the Weierstrass theory a function is defined at the outset only within the 
circle of convergence of the original power series. The function is then defined by 
the values given by the series together with all its continuations, which may pass 
branch points on opposite sides and thus give more than one value of f(z) for given z. 
But if f(z) is a rational function, not with a pole at the origin, the Weierstrass method 
would define it initially only within a circle extending to the pole of smallest modulus, 
and the process of continuation is needed before we can calculate it anywhere outside 
that circle. The method that we have adopted, on the other hand, enables us to 
calculate it directly from the fundamental rules for any value of z except the poles. 
Similarly, definitions by definite integrals are often directly applicable over larger 
regions than power series. 

The introduction of artificial barriers to replace many-valued functions by single-valued
ones is an awkward feature of the present method, since the same object would
usually be attained by many different cuts. It is possible either with the Weierstrass
or the Cauchy method to dispense altogether with the use of cuts and to consider the
function as a whole; we can choose one possible value of $f(z)$ at $z = z_0$, say, and consider
how $f(z)$ behaves if $z$ varies continuously from $z_0$ to $z_1$, $f(z)$ varying continuously,
or we can use the method of continuation by Taylor series. The value found for $f(z_1)$
will then depend on the route chosen if $f(z)$ is many-valued, and the route therefore
must be specified. This is done most systematically by the Riemann surface method,
which replaces the $z$ plane by a number of sheets winding into one another at the
branch points. In this theory it is always true that there is a singularity on the circle
of convergence, and the possibility of the series converging but not being equal to the
function in part of the circle does not arise; the sum of the series is always one value
of the function. The general treatment is beyond the scope of this book. We have
however a number of cases, especially in Chapters 21 and 25, where special attention
has to be given to different paths of integration in presence of one or more branch points.
Fortunately in these cases the distinctions between the paths are fairly simple.

* Harkness and Morley, Theory of Functions; Hurwitz and Courant, Allgemeine Funktionentheorie.

The treatment of functions with branch points is particularly important in the use
of conformal representation in Chapter 13, but there the principle of one-one correspondence
between points of the two planes compared makes the introduction of cuts
unavoidable.


11·151. If two functions $f_1(z)$, $f_2(z)$ are analytic in a region $D$ and equal in a region $D'$
within $D$, they are equal everywhere in $D$. Take $z_0$ to be any point in $D'$, and $Z$ any other
point in $D$. Then $f_1(z) = f_2(z)$ within any circle about $z_0$ that does not reach a singularity
or the boundary of $D'$. Now suppose $z_0$ and $Z$ connected by a curve of finite length in $D$
not reaching the boundary of $D$. The distances between points on the curve and the
boundary of $D$ have a positive lower bound $\delta$. Hence we can choose points $z_1, z_2, \dots, z_{n-1}$,
$z_n = Z$ on the curve such that $|z_r - z_{r-1}| < \delta$, and $n$ is finite. $z_1$ lies within the circle of
convergence of the series representing $f_1(z)$ and $f_2(z)$ in powers of $z - z_0$; hence both functions
have the same Taylor series in $z - z_1$, and $z_2$ is within its circle of convergence. Proceeding
we can show in a finite number of steps that both functions have the same Taylor series in
$z - z_{n-1}$, and its circle of convergence includes $Z$. Hence $f_1(Z) = f_2(Z)$.

It is not necessary to the argument that the functions should be known to be equal at
every point of a region. It is enough that they should be equal at, for instance, an infinite
number of points within a square, or even along a finite stretch of a straight line. For we
can establish the existence of a limit-point $z_0$ by the method of successive bisection, and
it is in a region where $f_1(z) - f_2(z)$ is analytic. Take it as origin and apply the argument
of 11·091, and it follows that $h(z) = f_1(z) - f_2(z)$ is everywhere zero for $|z - z_0| < \delta$.

It is astonishing that so much can be inferred from a knowledge of the values of the
function in a limited region, but we must remember, as for Cauchy's theorem, the severe
restrictions on the possible behaviour of the function imposed by the condition that it is
analytic. We shall see under Fourier's theorem that it is sometimes possible to extend the
definition of a function outside the original range in a quite different way by assuming
different properties outside the range.

The condition that the set of points where $h(z) = 0$ must have a limit-point is essential. If,
for instance, two functions are known to agree whenever $z$ is an integer, their difference
vanishes at an infinite number of points, but these have no limit-point, and the functions
could differ by any multiple of $\sin 2\pi z$. The condition that $h(z)$ must also be analytic at
the limit-point is also essential; for instance, if $h(z) = 0$ when $z = 1/n$ it could be any
multiple of $\sin(\pi/z)$. The additional information that $h(z)$ is analytic at $z = 0$ removes this
possibility.


We have seen in 11·141 that any function $f(z)$ defined by a power series must have at
least one singularity on the circle of convergence. The process of continuation may lead
to definitions all round a singularity $a$. If the result at a given $z$ depends on whether we
pass $a$ on the side where $\arg z < \arg a$ or $\arg z > \arg a$, $a$ is a branch point. If it is independent
of the route, $a$ is a pole or an essential singularity.

A type of application that we shall often meet is to the solution of a differential equation
of the form

$$F(z, \zeta'', \zeta', \zeta) = 0,$$

where $F$ is such that if $\zeta$ is an analytic function of $z$ in a region $D$ so also is $F$. We may
find by some special method that $\zeta = f(z)$, where $f(z)$ is analytic, satisfies this equation
in a certain region $D'$ of $z$, included in $D$. Then $F\{z, f''(z), f'(z), f(z)\}$ is an analytic function
of $z$, identically zero in $D'$. It follows by continuation that this function is zero over the
whole region $D$, and therefore $f(z)$ satisfies the differential equation in $D$.

11·152. It follows from the modified Heine-Borel theorem that if $f(z)$, a function of the
complex variable, has a power series expansion about every point of a closed region, a finite
set of squares can be found, covering the whole region, so that every square, or the part
of it that is not outside the region, lies wholly within the circle of convergence of the
expansion of $f(z)$ about some point of the square.

11·16. Two other theorems with some resemblance to Cauchy's inequality may be
proved here. First, the real or the imaginary part of an analytic function has no maximum
or minimum within any open region. For, if it has, let us take the maximum to be at $z = 0$
and the contour $C$ of 11·13 to be a circle about 0. Then

$$f(0) = \frac{1}{2\pi i} \int_C \frac{f(t)}{t}\,dt = \frac{1}{2\pi} \int_0^{2\pi} f(c e^{i\theta})\,d\theta,$$

where we have put $t = c e^{i\theta}$ on $C$. Hence $f(0)$ is equal to the mean value of $f(z) = \phi + i\psi$
on any circle about 0 and therefore neither $\phi$ nor $\psi$ can be greater at 0 than at every point
on the circle; similarly, they cannot be less than at every point of the circle. The extreme
values of $\phi$ and $\psi$ in any region must therefore be taken on the boundary. If $\phi$ is constant
on $C$, then by the uniqueness theorem (6·074) $\phi$ is constant within $C$; then by the
Cauchy-Riemann relations $\psi$ is constant within $C$. Hence $f(z)$ is constant within $C$, and
by analytic continuation (11·15) $f(z)$ is constant over the whole region. $\phi$ and $\psi$ can have
stationary points, but if they are maxima for some directions of displacement they are
minima for displacements in some other directions, as we can see by considering the
power-series expansions.

The three-dimensional analogue is that if $\nabla^2 \phi = 0$ within and on a sphere, $\phi$ at the centre
is the mean of the values over the sphere. (Cf. 6·092.)
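
The mean-value property of an analytic function on a circle is easily illustrated numerically. In the sketch below the test function, the centre and the radius are assumptions chosen only for the example, and `circle_mean` is a hypothetical helper.

```python
import cmath

def circle_mean(f, centre, c, steps=2000):
    """Mean of f over |z - centre| = c, sampled uniformly in the angle."""
    return sum(
        f(centre + c * cmath.exp(2j * cmath.pi * k / steps)) for k in range(steps)
    ) / steps

def f(z):
    # Assumed analytic test function, chosen only for illustration.
    return cmath.exp(z) * (z * z + 1)

centre, c = 0.2 + 0.1j, 0.75
mean = circle_mean(f, centre, c)
on_circle = [f(centre + c * cmath.exp(2j * cmath.pi * k / 360)) for k in range(360)]
real_parts = [v.real for v in on_circle]
imag_parts = [v.imag for v in on_circle]
print(abs(mean - f(centre)))
print(min(real_parts), f(centre).real, max(real_parts))
```

The value at the centre lies strictly between the least and greatest sampled values of the real part on the circle, and likewise for the imaginary part.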

11·161. Maximum modulus principle. If $f(z)$ is analytic in a region, $|f(z)|$ has no maximum within the region; and if $|f(z)| \le M$ at all points of the boundary, $|f(z)| < M$ at all points of the interior unless $f(z)$ is constant. Let $z_0$ be a point within the region, put $z = z_0 + t$, and take a small circle $C$, $t = ce^{i\theta}$, within the region. Then, if asterisks denote conjugate complexes,

$$f(z) = \phi + i\psi = a_0 + \sum_{n=1}^{\infty} a_n t^n; \qquad f^* = \phi - i\psi = a_0^* + \sum_{n=1}^{\infty} a_n^* t^{*n}; \qquad |f(z)|^2 = ff^*$$

and the series are absolutely convergent on $C$. If $m$, $n$ are unequal integers,

$$\int_0^{2\pi} t^m t^{*n}\,d\theta = c^{m+n}\int_0^{2\pi} e^{(m-n)i\theta}\,d\theta = 0$$

and therefore

$$\frac{1}{2\pi}\int_0^{2\pi} ff^*\,d\theta = \frac{1}{2\pi}\int_0^{2\pi}\Bigl(a_0 a_0^* + \sum_{n=1}^{\infty} a_n a_n^* t^n t^{*n}\Bigr)d\theta = |a_0|^2 + \sum_{n=1}^{\infty} |a_n|^2 c^{2n} \ge |a_0|^2 = |f(z_0)|^2.$$

Hence $|f(z_0)|^2$ is not greater than the mean value of $|f(z)|^2$ on $C$, equality holding only if $f(z)$ is constant. If $f(z)$ is not constant it follows that for any $z_0$ not on the boundary there are points $z$ arbitrarily near $z_0$ such that $|f(z)| > |f(z_0)|$, and the upper bound of $|f(z)|$ can be taken only on the boundary.
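
The identity for the mean of $|f|^2$ on $C$ can be verified for a polynomial, where the series terminate. This is only a sketch; the coefficients and the radius below are arbitrary illustrative values, not taken from the text.

```python
import cmath
import math

coeffs = [2 + 1j, 0.5, -1j, 0.25]   # a_0, a_1, a_2, a_3 (illustrative assumption)
c = 0.8                              # radius of the circle C

def f(t):
    """f(z_0 + t) = sum of a_n t^n."""
    return sum(a * t**n for n, a in enumerate(coeffs))

N = 512
# Mean of |f|^2 over equally spaced points of the circle |t| = c.
mean_sq = sum(abs(f(c * cmath.exp(2j * math.pi * k / N))) ** 2 for k in range(N)) / N
# |a_0|^2 + sum |a_n|^2 c^(2n), the right-hand side of the identity.
series = sum(abs(a) ** 2 * c ** (2 * n) for n, a in enumerate(coeffs))

print(mean_sq, series, abs(coeffs[0]) ** 2)
```

The first two printed numbers should agree, and both exceed $|a_0|^2 = |f(z_0)|^2$ because some $a_n$ with $n \ge 1$ is not zero.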

Alternatively, if $\arg f(z_0) = \alpha$, take $g(z) = e^{-i\alpha} f(z)$. Then $\Re g(z_0) = |f(z_0)|$, $\Im g(z_0) = 0$. If we take any circle $C$ about $z_0$ as centre, and $\Re g(z)$ is not constant, then, by 11·16, $\Re g(z) > |f(z_0)|$ at some point $z_1$ of $C$. Hence $|f(z_1)| = |g(z_1)| \ge \Re g(z_1) > |f(z_0)|$.

By applying the same arguments to $1/f(z)$, we see that $|f(z)|$ cannot have a minimum at an internal point if $1/f(z)$ is analytic.

The maximum modulus principle has considerable mathematical and also physical importance. It can be stated alternatively: if $|f(z)| = M$ on a closed contour, and at some point within the contour $|f(z)| > M$, then $f(z)$ has a singularity within the contour. If at some point within the contour $|f(z)| < M$, $1/f(z)$ has a singularity within the contour and $f(z)$ will have a zero or an essential singularity.

It shows also that in two-dimensional electrostatics the maximum electric intensity, 
and in hydrodynamics the maximum velocity, occur on the boundary provided there are 
no singularities within it. These statements can be extended to three dimensions. 

A curve of constant $|f(z)|$ may have a node $z_1$. If so, we have that for two distinct directions through $z_1$, $\frac{d}{ds}|f(z)|^2 = 0$. But $|f(z)|^2$ is differentiable;* hence $\frac{d}{ds}|f(z)|^2 = 0$ for any direction through $z_1$ and $f'(z_1) = 0$. (The case $|f(z_1)| = 0$ can be excluded. For if there was a continuous curve with $|f(z)| = |f(z_1)| = 0$, $z_1$ would be a limit-point of zeros of $f(z)$, which would therefore be zero everywhere.) Conversely, if $f'(z_1) = 0$, $\frac{d}{ds}|f(z)|^2 = 0$ for every direction through $z_1$, and a curve of constant $|f(z)|$ through $z_1$ has a node if $f(z)$ is not constant everywhere.

If the curve $|f(z)| = M$ passes through a node and has a loop within the region, the loop may be regarded as a closed curve by itself and the maximum modulus principle applies to it.


11·162. Schwarz's lemma. If $f(z)$ is analytic and $|f(z)| \le M$ for $|z| \le c$, and $f(0) = 0$, $f(z)/z$ is analytic for $|z| \le c$ and $|f(z)/z| \le M/c$ for $|z| = c$. Hence, by the maximum modulus principle (11·161), for $|z| \le c$, $|f(z)/z| \le M/c$, that is,

$$|f(z)| \le \frac{M}{c}\,|z|.$$

This lemma is due to Schwarz.


* As a function of $x$ and $y$, in the sense of Stolz and Young.


11·17. Laurent's theorem. Let $f(z)$ be analytic and single-valued between and on two circles $C$ and $C'$, with centre 0, $C'$ being interior to $C$. Take $z$ between them and draw a small circle $C''$ around $z$. Then $\dfrac{f(t)}{t-z}$ is a function of $t$ with no singularity in the region bounded by $C$, $C'$ and $C''$ or on the bounding curves. Hence

$$\int_{C''} \frac{f(t)}{t-z}\,dt = \int_{C} \frac{f(t)}{t-z}\,dt - \int_{C'} \frac{f(t)}{t-z}\,dt,$$

all contours being taken in the positive sense. But as for Cauchy's integral the integral on the left is $2\pi i f(z)$. To evaluate that round $C$, we can expand in powers of $z/t$ since $|t| > |z|$; hence it is

$$\sum_{n=0}^{\infty} z^n \int_C \frac{f(t)}{t^{n+1}}\,dt.$$

On $C'$, $|z| > |t|$, and we can expand in powers of $t/z$; the corresponding integral is

$$-\sum_{n=1}^{\infty} z^{-n} \int_{C'} f(t)\,t^{n-1}\,dt.$$

Altogether

$$f(z) = \frac{1}{2\pi i}\Bigl[\sum_{n=0}^{\infty} z^n \int_C \frac{f(t)}{t^{n+1}}\,dt + \sum_{n=1}^{\infty} z^{-n}\int_{C'} f(t)\,t^{n-1}\,dt\Bigr].$$

This shows that in any region between two concentric circles that is free from singularities a function can be expanded in a power series including negative powers.
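
As an illustration of the coefficient integrals, the following sketch evaluates them numerically for $f(z) = 1/\{z(1-z)\}$ in the zone $0 < |z| < 1$, where the expansion is $z^{-1} + 1 + z + z^2 + \ldots$. The function name `laurent_coefficient` and the choice of example are assumptions made for illustration; a single circle of radius `rho` inside the zone is used for both sets of integrals, which is permissible because the integrands are analytic between $C'$ and $C$.

```python
import cmath
import math

def laurent_coefficient(f, n, rho, samples=2048):
    """Coefficient of z^n: (1/(2 pi i)) * integral of f(t) t^(-n-1) dt round |t| = rho."""
    total = 0j
    for k in range(samples):
        t = rho * cmath.exp(2j * math.pi * k / samples)
        total += f(t) * t ** (-n - 1) * t     # dt = i t d(theta); i/(2 pi i) = 1/(2 pi)
    return total / samples

f = lambda z: 1 / (z * (1 - z))

for n in (-2, -1, 0, 2):
    print(n, laurent_coefficient(f, n, 0.5))
```

The computed coefficients should be close to 0 for $n = -2$ and close to 1 for $n = -1, 0, 2$.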

11·171. Several important consequences follow. First suppose that $f(z)$ has a singular point at the origin and no other within $C$. Then the integrals around $C'$ are independent of the radius of $C'$, by 11·055. If possible let $m$ be an integer such that $z^m f(z)$ is bounded within some circle $D$ within $C'$, except possibly at the origin; that is, $a$, $M$ exist so that

$$|z^m f(z)| < M \qquad (0 < |z| < a).$$

Then if $n \ge m+1$ and the radius of $D$ is $b < a$

$$\Bigl|\int_D t^{n-1} f(t)\,dt\Bigr| = \Bigl|\int_D t^{n-m-1}\,t^m f(t)\,dt\Bigr| < 2\pi M\,|t|^{n-m-1}\,|t| \qquad (|t| = b)$$

$$= 2\pi M b^{n-m},$$

which is arbitrarily small since we can take $b$ as small as we like. But since the integral is independent of $b$ it must be zero. Therefore if $z^m f(z)$ is bounded within any given radius from the origin (except possibly at the origin itself) the Laurent expansion contains no terms in $z^{-n}$ with $n > m$. If $m$ is the lowest integer ($> 0$) that makes this true, $f(z)$ has a pole of order $m$ at the origin. If it is true for $m = 0$, the whole of the negative powers disappear and we recover Cauchy's expansion. We can prove nothing about the value of $f(0)$ by using Laurent's theorem alone, since $z = 0$ is always excluded from the region. But if $m = 0$ and $f(z)$ is continuous at $z = 0$, $f(0)$ must be equal to the constant term in the Cauchy expansion and therefore $f(z)$ is analytic at $z = 0$. We can therefore say that if $m = 0$ and $f(z)$ is continuous, $f(z)$ is analytic at the origin. Hence if $m$ is a positive integer and $z^m f(z)$ is bounded in the neighbourhood of the origin, but $z^{m-1} f(z)$ is not, $f(z)$ has a pole of order $m$ at the origin; and if $f(z)$ is bounded and continuous it is analytic.

Now this is the converse of the definition of a pole of order $m$. Therefore a necessary and sufficient condition for an isolated singularity at $z = 0$ of a single-valued function to be an essential singularity is that $z^m f(z)$ shall be unbounded near the origin for all positive integral values of $m$.

11·172. Behaviour near an isolated essential singularity. Again, if $f(z)$ has a pole of order $m$, $1/f(z)$ has a zero of order $m$, and conversely. The condition that $f(z)$ is not analytic at $z = a$, but $1/f(z)$ is, is in fact often taken as the definition of a pole. Similarly, $f(z) - c$ has a pole of order $m$, where $c$ is any constant, and therefore $1/\{f(z) - c\}$ has a zero of order $m$. Conversely, if the last function has a zero of order $m$, $f(z) - c$ and therefore $f(z)$ have a pole of order $m$. If $1/\{f(z) - c\}$ is analytic at 0, and not zero, $f(z)$ is analytic. Hence if $f(z)$ has an essential singularity at 0, $1/\{f(z) - c\}$ also has an essential singularity at 0. For if the latter function was analytic $f(z)$ could have at worst a pole, and if $1/\{f(z) - c\}$ had a pole $f(z)$ would be analytic. But if $1/\{f(z) - c\}$ has an isolated essential singularity at 0, it is unbounded near 0; hence in any circle about an isolated essential singularity $f(z)$ must somewhere approach arbitrarily close to any finite value. It has been proved by Picard that it actually takes every value an infinite number of times,* except possibly for one value that it never takes (for example 0 for $f(z) = e^{1/z}$). We can, of course, in no circumstances speak of the value at the singularity itself, since this is undefinable directly and a definition by any limiting process will depend on the method of approach.
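
For $f(z) = e^{1/z}$ the points where a given non-zero value $w$ is taken can be written down explicitly, since $e^{1/z} = w$ when $1/z = \log w + 2\pi i k$ for any integer $k$. The sketch below lists a few of them; the function name `preimages_near_zero` and the target value are illustrative assumptions.

```python
import cmath
import math

def preimages_near_zero(w, count=5):
    """Points z_k with exp(1/z_k) = w (w != 0); |z_k| decreases as k increases."""
    base = cmath.log(w)
    return [1 / (base + 2j * math.pi * k) for k in range(1, count + 1)]

target = -3 + 4j
points = preimages_near_zero(target)
for z in points:
    print(abs(z), cmath.exp(1 / z))
```

Every circle about the origin, however small, therefore contains infinitely many points at which $e^{1/z}$ takes the chosen value, in agreement with Picard's result; the value 0 alone is never taken.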


11·173. When an expansion of Laurent's form exists about an essential singularity, say

$$f(z) = \sum_{n=0}^{\infty} a_n z^n + \sum_{m=1}^{\infty} a_{-m} z^{-m},$$

we have

$$\int_D f(z)\,dz = 2\pi i\,a_{-1},$$

where $D$ is any contour lying between $C$ and $C'$ and surrounding the origin. Thus the method of residues is applicable even to functions with essential singularities provided that the conditions for Laurent's theorem are satisfied.


11·174. It should be noted that, if there are several singularities within $C$, the expansion will alter for any change of $C'$ that makes it pass over a singularity. Thus $\exp(1/z)$ has a single expansion in negative powers valid everywhere except at $z = 0$. But $\operatorname{cosec}(1/z)$ and $\exp(\operatorname{cosec} 1/z)$ can be expanded only in a zone from $|z| = 1/(n+1)\pi$ to $|z| = 1/n\pi$, and will require different expansions for every value of $n$. No power series, even including negative powers, can hold in the neighbourhood of an unisolated essential singularity.

11·175. If $f(z)$ has no singularities except a finite number of poles, $\alpha_1, \alpha_2, \ldots$ and is bounded at infinity, $f(z)$ is the sum of the principal parts at the poles together with a constant. Take small circles $C_1, C_2, \ldots$ about the poles and a contour $C$ large enough to include all the poles; then if $z$ is within $C$ and outside $C_1, C_2, \ldots$,

$$f(z) = \frac{1}{2\pi i}\int_C \frac{f(t)}{t-z}\,dt - \sum_r \frac{1}{2\pi i}\int_{C_r} \frac{f(t)}{t-z}\,dt.$$

The first integral is independent of $C$ so long as $C$ contains all the poles of $f(t)/(t-z)$. Hence we can take $C$ arbitrarily large. Then since $f(t)$ is bounded for large $t$ the integral is a finite constant. As in the proof of Laurent's theorem the integral about each $C_r$ gives the principal part at the pole $\alpha_r$. This proves the theorem.

It follows that a function with no singularities except poles, and bounded at infinity, is a rational function. For we have only to bring the principal parts to a common denominator, and $f(z)$ is the ratio of two polynomials, the numerator being of degree not higher than the denominator.


11·176. Fourier's theorem. A form of this theorem can be derived from Laurent's theorem. Under the conditions for the latter all the $f(t)\,t^{n-1}$ and $f(t)\,t^{-n-1}$ are analytic in the region used, and the paths $C$ and $C'$ can therefore be replaced by a circle $D$, $|t| = |z|$. Put $z = re^{i\theta}$, $t = re^{i\phi}$; then

$$z^{-n}\int_D f(t)\,t^{n-1}\,dt = i\int_0^{2\pi} f(re^{i\phi})\,e^{ni(\phi-\theta)}\,d\phi,$$

$$z^{n}\int_D f(t)\,t^{-n-1}\,dt = i\int_0^{2\pi} f(re^{i\phi})\,e^{-ni(\phi-\theta)}\,d\phi,$$

$$f(z) = \frac{1}{2\pi}\int_0^{2\pi} f(re^{i\phi})\,d\phi + \frac{1}{\pi}\sum_{n=1}^{\infty}\int_0^{2\pi} f(re^{i\phi})\cos n(\phi-\theta)\,d\phi.$$

This is one of the simplest ways of finding the Fourier expansion; but it assumes that $f(t)$ is analytic within a zone $|t_1| < |t| < |t_2|$, where $|t_1| < |z| < |t_2|$, and this condition is not usually satisfied in the cases where we need the expansion. We shall obtain the expansion under more general conditions in Chapter 14.

* The proof is surprisingly difficult to complete. Cf. Titchmarsh, Theory of Functions, p. 283; Landau, Darstellung...Funktiontheorie, 1929, Ch. 7.

11*18. Functions with no continuation. It might be supposed that, since the most 
familiar power series can be continued by Taylor’s series beyond their original circles of 
convergence, this can always be done. This is not so. Consider the series 

$$f(z) = 1 + z + z^2 + z^6 + \ldots + z^{n!} + \ldots.$$

This has radius of convergence 1, and hence there is at least one singularity with modulus 1. As we approach $z = 1$ from inside, every term tends to 1 and the sum to infinity; and since a point where $f(z)$ tends to infinity for at least one direction of approach must be a singularity, $z = 1$ is a singularity of the function.

Now within any arc of the circle, however short, there are points where $\arg z$ is a rational fraction of $2\pi$. Put then

$$z = r\exp\Bigl(2\pi i\,\frac{m}{n}\Bigr) \qquad (r < 1),$$

where $m$ and $n$ are integers. Then

$$z^{n!} = r^{n!}\exp\{2\pi i\,m\,(n-1)!\} = r^{n!},$$

$$z^{(n+1)!} = (z^{n!})^{n+1} = r^{(n+1)!},$$

and so on; and the series becomes

$$f(z) = (1 + z + \ldots + z^{(n-1)!}) + r^{n!} + r^{(n+1)!} + \ldots.$$

As $r$ approaches 1 the first part tends to a finite limit and the rest to infinity. Hence there is a singularity at $z = \exp(2\pi i m/n)$ and therefore in every arc of the circle. No Taylor expansion about an internal point of the circle can converge beyond the nearest singularity, and this is at the nearest point on the circle (since a limit-point of singularities is a singularity). Hence no continuation is possible.
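
The growth of the series along a radius with rational argument can be seen numerically. In the sketch below the phase of each term is computed by exact integer arithmetic, following the argument just given; the function name `lacunary_sum`, the truncation at 12 terms and the sample values of $r$ are assumptions made for illustration.

```python
import cmath
import math

def lacunary_sum(r, m, n, terms=12):
    """1 + sum_{k=1}^{terms} z^(k!) at z = r*exp(2*pi*i*m/n), with 0 < r < 1."""
    total = 1 + 0j
    for k in range(1, terms + 1):
        e = math.factorial(k)
        # (m * e) mod n is 0 as soon as k >= n, so the later terms are real and positive.
        phase = 2 * math.pi * ((m * e) % n) / n
        modulus = math.exp(e * math.log(r))          # r^(k!), underflowing harmlessly to 0
        total += modulus * cmath.exp(1j * phase)
    return total

sizes = [abs(lacunary_sum(r, 1, 3)) for r in (0.9, 0.999, 0.99999)]
print(sizes)
```

The moduli increase as $r$ approaches 1 along the radius with argument $2\pi/3$, though slowly, since only about $\log\{1/(1-r)\}/\log\log\{1/(1-r)\}$ terms are appreciable for a given $r$.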

A still more peculiar case is Osgood's series

$$f(z) = \sum_{n} \frac{z^{a^n+2}}{(a^n+1)(a^n+2)},$$

where $a$ is an integer greater than 1. The circle of convergence is again $|z| = 1$, but not only does the series converge at all points of the circle, but its first derived series does. Yet we know from general considerations that there must be at least one singularity on the circle. We call this $z_0$ and consider $f(z_1)$ and $f(z_2)$, where

$$|z_1| = |z_2| = r < 1, \qquad \arg z_1 = \arg z_0, \qquad \arg z_2 = \arg z_0 + 2\pi k/a^m,$$

$k$ being an integer. As for the last series, the terms up to that with $n = m - 1$ differ in the two cases, but the later ones are all identical except for the factor $(z_2/z_1)^2$. Consequently if we form the Taylor series about $z_1$ and $z_2$ and consider points on the respective radii there will be differences in the early terms, but the later ones, which determine the convergence, are in a constant ratio. Hence the Taylor series in $z - z_2$ is not convergent if $|z - z_2| > 1 - r$. Since $m$ may be taken arbitrarily large we have again the result that there must be a singularity in every arc of the circle of convergence, in spite of the apparently good behaviour of the function there. We may compare the fact that $z^2\log z$ and its first derivative vanish at $z = 0$, which is nevertheless a branch point of the function; but it is surprising that such behaviour should be possible over an entire circumference. The reason why the function must be considered non-analytic at $|z| = 1$ is that it is impossible to say what we mean by its derivative as $|z|$ approaches 1 from outside because we cannot say what $f(z)$ is outside.

These cases are artificial, but are given here to show that it is quite possible that an analytic function may be defined in part of the plane and completely indefinable outside it. Every point of the circle of convergence is a singularity.


11·19. Abel's theorem. If a power series $\sum a_n z^n$ is convergent at a point $z_0$ on its circle of convergence, it is uniformly convergent on the radius up to and including $z_0$, and its sum approaches the sum $\sum a_n z_0^n$ in the limit. It is not necessary that $\sum a_n z_0^n$ should be absolutely convergent. The theorem follows by applying that of 1·1154 to the real and imaginary parts of the sum separately. The theorem is important because it often provides a way of summing conditionally convergent series that would otherwise be unmanageable. Thus consider the series

$$\log(1+z) = z - \tfrac12 z^2 + \tfrac13 z^3 - \ldots,$$

which converges on the circle of convergence except at $z = -1$. Hence by Abel's theorem we can put $z = e^{i\theta}$ directly, and get

$$\log(1 + e^{i\theta}) = e^{i\theta} - \tfrac12 e^{2i\theta} + \tfrac13 e^{3i\theta} - \ldots,$$

and also

$$\log(1 + e^{i\theta}) = \log\{2\cos\tfrac12\theta\;e^{\frac12 i\theta}\} = \log(2\cos\tfrac12\theta) + \tfrac12 i\theta.$$

Separating real and imaginary parts we have

$$\cos\theta - \tfrac12\cos 2\theta + \tfrac13\cos 3\theta - \ldots = \log(2\cos\tfrac12\theta),$$

$$\sin\theta - \tfrac12\sin 2\theta + \tfrac13\sin 3\theta - \ldots = \tfrac12\theta, \qquad (-\pi < \theta < \pi).$$

This device of inserting powers of $r$ ($< 1$) in the coefficients of a series to make it absolutely convergent is now usually known as Abel summation. It had previously been used by Euler, and is therefore also called Euler summation; but the latter name is now usually given to another method also due to Euler.
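
The two trigonometric sums, and the effect of inserting powers of $r$, can be checked by direct summation. This is a sketch only; the function name `abel_series`, the number of terms and the sample values $\theta = 1$, $r = 0.9$ are assumptions made for illustration.

```python
import cmath
import math

def abel_series(theta, r=1.0, terms=100000):
    """Partial sums of sum (-1)^(n+1) r^n cos(n theta)/n and of the matching sine series."""
    c = s = 0.0
    power = 1.0
    for n in range(1, terms + 1):
        power *= -r                          # power = (-r)^n
        c -= power * math.cos(n * theta) / n
        s -= power * math.sin(n * theta) / n
    return c, s

theta = 1.0
c1, s1 = abel_series(theta)                  # on the circle of convergence, r = 1
c9, s9 = abel_series(theta, r=0.9)           # inside it
inside = cmath.log(1 + 0.9 * cmath.exp(1j * theta))
print(c1, math.log(2 * math.cos(theta / 2)), s1, theta / 2)
print(c9, inside.real, s9, inside.imag)
```

With $r = 1$ the convergence is conditional and slow, the error after $N$ terms being of order $1/N$; with $r < 1$ it is absolute and rapid, and the sums agree with $\log(1 + re^{i\theta})$ to rounding accuracy.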

The method can even be used to suggest a meaning for a series in a region where it does not converge. For instance, we might obtain the series

$$a\cos\theta - \tfrac12 a^2\cos 2\theta + \tfrac13 a^3\cos 3\theta - \ldots$$

as the solution of some problem, but in the conditions of that problem $a > 1$ and the series has no definite meaning as it stands. Nevertheless, we may try the result of taking $a < 1$, when the series becomes

$$\Re\{\log(1 + ae^{i\theta})\} = \log|1 + ae^{i\theta}| = \tfrac12\log(1 + 2a\cos\theta + a^2).$$

This suggests a meaning even when $a > 1$, and a suggested form for the answer is often a great help towards obtaining a valid proof. In this case the justification would be completed if we knew that the function required was the real part of an analytic function of the complex variable $ae^{i\theta}$ for all $a$; for then we could sum within the circle of convergence and apply the result outside by analytic continuation.

11·20. Morera's theorem. Suppose that for all paths $L$ within a certain closed region the integral of a continuous single-valued complex function, not assumed analytic, between given termini $z_0 = (x_0, y_0)$, $Z = (X, Y)$ has the same value, and therefore depends only on the termini. We write

$$\int_L \{\phi(x,y) + i\psi(x,y)\}\,dz = F(X, Y).$$

Also

$$\int_{L'} \{\phi(x,y) + i\psi(x,y)\}\,dz = F(X + \xi, Y + \eta),$$

where $L'$ is a path connecting $z_0$ with $Z + \zeta$, where $\zeta = \xi + i\eta$ is small enough for the straight line connecting $Z$ with $Z + \zeta$ to lie wholly within the region. Then $L'$ can be taken to lie entirely within the region. Since $F(X + \xi, Y + \eta)$ is unaltered by changes of the path $L'$ so long as the ends remain the same and it continues to lie within the region, we can take $L'$ to coincide with $L$ from $z_0$ to $Z$ and then proceed as a straight line to $Z + \zeta$. Then if $\arg\zeta = \theta$ we have for variations with argument $\theta$

$$\lim_{|\zeta|\to 0}\frac{1}{\zeta}\{F(X+\xi, Y+\eta) - F(X, Y)\} = \lim_{|\zeta|\to 0}\frac{1}{\zeta}\int_Z^{Z+\zeta}(\phi + i\psi)\,dz = \phi(X, Y) + i\psi(X, Y);$$

and this is independent of $\arg\zeta$. Hence $F(X, Y)$ has a derivative in the sense of 11·03 and is an analytic function of the complex variable $Z$. Hence $\phi + i\psi$ also is an analytic function.

This theorem is a converse of Cauchy’s. 

One importance of this theorem is that it shows that the uniqueness of a complex 
integral for variations of the path involves the same sort of restrictions on the real and 
imaginary parts of the integrand as the uniqueness of the derivative for variations of 
direction. 

Another is that it provides an easy proof of the proposition that a uniformly convergent series of analytic single-valued functions is an analytic single-valued function. So far we have proved this only for power series. Let

$$S(z) = \sum_{n=0}^{\infty} u_n(z),$$

where the $u_n(z)$ are analytic single-valued functions of $z$. If the series is uniformly convergent in a region it can be integrated term by term along any path in the region; hence

$$\int_L S(z)\,dz = \sum_{n=0}^{\infty}\int_L u_n(z)\,dz.$$

But since the $u_n(z)$ are analytic and single-valued in the region their integrals depend only on the termini. Hence the integral on the left depends only on the termini. Therefore, by Morera's theorem, $S(z)$ is an analytic single-valued function in the region.


11·21. The Osgood-Vitali theorem. A much more powerful condition for the limit of a sequence of analytic functions to be itself an analytic function was given by W. F. Osgood* and extended by G. Vitali† and R. Jentzsch. Let $\{f_n(z)\}$ be a sequence of functions, each analytic in a region $D$; let $|f_n(z)| \le M$ for every $n$ and every $z$ in $D$; and let $f_n(z)$ tend to a limit, as $n \to \infty$, at a set of points having a limit-point inside $D$. Then in any region interior to $D$, $f_n(z)$ tends uniformly to a limit, and the limit is an analytic function of $z$.

We take the limit-point to be $z = 0$. All $f_n(z)$ have power series expansions valid for $|z| \le c$, where $c$ is any quantity less than the distance of 0 from the boundary of $D$; that is,

$$f_n(z) = \sum_{p=0}^{\infty} a_{n,p}\,z^p.$$

If $|f_n(z)| \le M$ for $|z| = c$, then $|f_n(z)| \le M$, $|a_{n,p} z^p| \le M$ for $|z| \le c$. Suppose that $a_{n,m}$ ($m \ge 0$) does not tend to a limit as $n \to \infty$ but that $a_{n,p}$ does so for all $p < m$, if any. Consider

$$g_n(z) = z^{-m}\Bigl\{f_n(z) - \sum_{p=0}^{m-1} a_{n,p}\,z^p\Bigr\}.$$

On $|z| = c$, $|g_n(z)| \le (m+1)M/c^m$. Hence $g_n(z)$ is uniformly bounded with respect to $n$ and $z$, and $g_n(0) = a_{n,m}$. Then $g_n(z) - g_n(0)$ is uniformly bounded ($\le (m+2)M/c^m$), and is zero for $z = 0$; hence for $|z| \le c$, by Schwarz's lemma (11·162)

$$|g_n(z) - a_{n,m}| \le (m+2)M\,|z|/c^{m+1}.$$

Now if $q > n$

$$|a_{n,m} - a_{q,m}| \le |g_n(z) - a_{n,m}| + |g_q(z) - a_{q,m}| + |g_n(z) - g_q(z)|.$$

Take $r$ so that $(m+2)Mr/c^{m+1} < \omega$, and $z'$ such that $0 < |z'| < r$ and such that $f_n(z')$ and therefore $g_n(z')$ has a limit; take $n$ so that for all $q > n$

$$|g_n(z') - g_q(z')| < \omega.$$

Then for all $q > n$

$$|a_{n,m} - a_{q,m}| < 3\omega,$$

and therefore $a_{n,m}$ has a limit, contrary to hypothesis. Hence $a_{n,m}$ tends to a limit $a_m$ for every $m$. Also since for all $n$, $|a_{n,p}| \le M/c^p$, we have $|a_p| \le M/c^p$, and the series $\sum a_p z^p$ converges in any circle $|z| < c$ and defines an analytic function $f(z)$. Also convergence of $\sum a_{n,p} z^p$ is uniform with respect to both $n$ and $z$ for $|z| \le d < c$. Hence for $|z| \le d$

$$\lim_{n\to\infty} f_n(z) = \sum_{p=0}^{\infty}\lim_{n\to\infty} a_{n,p}\,z^p = \sum_{p=0}^{\infty} a_p z^p = f(z),$$

and $f_n(z)$ tends uniformly to $f(z)$.

Now take $D'$ to be any region interior to $D$ and including $z = 0$. Let $\delta$ be the distance of the boundary of $D'$ from that of $D$. Then $D'$ can be covered by a finite set of overlapping circles of radius $c < \delta$, in such a way that no circle meets the boundary of $D$, and the centre of every circle lies within at least one other. Let the centre of a circle $C_1$ lie within $C_0$, where $C_0$ has centre 0. Then the conditions hold in the common part of $C_0$ and $C_1$ and therefore in the whole of $C_1$, and by repetition they hold in every circle of the set. Hence $f_n(z)$ tends to an analytic limit function $f(z)$ in the whole of $D'$, and convergence is uniform because the set of circles is finite.


* Annals of Mathematics, (2), 3, 1902, 25-34. 
† Annali di Matematica, (3), 10, 1904, 65-82.

The most difficult part of any application of the last part of 11·20 is to show that the series (or the sequence) considered is uniformly convergent. The Osgood-Vitali theorem establishes this in general subject only to the existence of a limit at an infinite set of points with a limit-point in the region and to the sequence being bounded. In practice the existence is usually established at all points of some region enclosed in $D$, which is a more stringent condition than that assumed by Osgood, and still more than that assumed by Vitali.

The limit-function is not necessarily analytic on the boundary of D. 


EXAMPLES 

1. Prove that if we tried to define an algebra of number pairs by 11·01 (1′), (2′), (3′) and

$$yy' = (aa' + bb',\ ab' - ba'),$$

the commutative law of multiplication would not be satisfied.

2. If $z_n$ is the $(n+16)$th root of unity, of least positive argument, and $w_n$ is the $n$th power of $z_n$, prove that for $n = 1, 2, \ldots$ the points $w_n$ proceed anticlockwise once round the unit circle. Determine how many are situated in each of the four quadrants. Also evaluate $|w_n - z_n|$ as a function of $n$ and determine for what $n$ its value is largest. (I.C. 1942.)

3. Prove that if a sequence of complex numbers $z_n$ tends to a finite limit $c$ ($\ne 0$), then $1/z_n \to 1/c$. Prove also that if $f(z)$ is analytic and $f(0) \ne 0$, then $1/f(z)$ is analytic at $z = 0$.

4. Show that, if $r_0$ and $A$ are positive real numbers, and

$$r_{n+1} + \frac{1}{r_n} = 2A,$$

then the condition $A \ge 1$ is necessary for the convergence of the sequence $\{r_n\}$; show that it is also sufficient in the case $r_0 > 1$, by verifying that $r_n > 1$ for every $n$, and

K-c|< 

for a suitable c> 1. 

5. Given that the series Ea n z n has the radius of convergence 2, find the radii of convergence of 

Sa„z”, Sa n z ni , La n z”, 2(ai + a 2 + ...+a^)z n , 
where k is a fixed positive integer, and in the fourth series the numbers a n are positive. (M.T. 1942.) 

6. By considering the function $\exp\{kf(z)\}$ for suitable constants $k$, or otherwise, show that

(i) if $u$ is bounded for all $z$ then $f(z)$ is constant,

(ii) if $u < v$ for all $z$ then $f(z)$ is constant,

where $f(z) = u + iv$ is an analytic function of $z$ and $u$ and $v$ are real. (Prelim. Exam. 1943.)

7. What are the radii of convergence of the expansions in powers of z of the following functions? 

/•\ 2 ,.. v , sinz 

w ( 11 ) log—-, (iii) expfil-z 2 ) 1 /*}. (M/c III, 1928.) 

8. Derive an alternative proof of the maximum modulus principle from the fact that $\log|f(z)| = \Re\log f(z)$ in a suitably defined region.

9. If $f(z) = a_0 + a_m z^m + R(z)$, where $a_m$ is the first non-zero coefficient after $a_0$ in the expansion, prove that $c$ exists such that for $|z| < c$, $|R(z)| < \tfrac12|a_m z^m|$, and hence prove the maximum modulus principle.

10. Prove that an analytic function whose only singularity is a pole at infinity is a polynomial. (I.C. 1939.)

11. If

$$\Omega(\kappa) = \int e^{\kappa x} f(x)\,dx,$$

where $x$ is real and $f(x)$ is real and $\ge 0$, exists for every complex value of $\kappa$ (including $\kappa = 0$), prove that $\Omega(\kappa)$ is an integral function.

12. If $f(z)$ is analytic in a region and not constant, prove that the values of $z$ where $|f(z)| = M$ do not form an arc with an end interior to the region.

13. Comment on the following proof of Cauchy's theorem.

Put

$$I(\lambda) = \int_C f(\lambda z)\,dz.$$

Then

$$\frac{dI}{d\lambda} = \int_C z f'(\lambda z)\,dz = \Bigl[\frac{z}{\lambda} f(\lambda z)\Bigr]_C - \frac{1}{\lambda}\int_C f(\lambda z)\,dz = -\frac{I}{\lambda}.$$

Therefore $I = A/\lambda$, and by making $\lambda \to 0$ we have $A = 0$. Finally put $\lambda = 1$.

14. If $\sum u_n(z)$ is uniformly convergent on a rectifiable contour $C$, and $u_n(z)$ is analytic within $C$ and continuous in the closed region, prove that $\sum u_n(z)$ is uniformly convergent in any region interior to $C$.

15. Prove that the real and imaginary parts of $f(z)$, where $f(z)$ is analytic within $C$, have bounded variation on any rectifiable path within $C$.


16. If

$$f(a) = \frac{1}{2\pi i}\int_C \frac{\phi(z)}{z-a}\,dz,$$

where $C$ is the circle $|z| = 1$, find $f(a)$, (i) when $\phi(z) = 1/z$, (ii) when $\phi(z) = \Re(z)$. (M.T. 1940.)

17. Prove that the radius of convergence of $\sum a_n z^n$ is the smallest limit-point of the set $|a_n|^{-1/n}$.

CONTOUR INTEGRATION AND BROMWICH'S INTEGRAL

'Go round about, Peer Gynt!'

IBSEN, Peer Gynt

12·01. Description of method. This method of evaluating definite integrals is based directly on Cauchy's theorem. We have had instances of it already in the proofs of Cauchy's inequality and Laurent's theorem. A simple example is the following. Take

$$I = \int_0^{\pi} \frac{d\theta}{a - b\cos\theta}, \qquad (1)$$

where $a$ and $b$ are real and $a > b > 0$. The integrand is an even function of $\theta$, and therefore

$$I = \frac12\int_0^{2\pi} \frac{d\theta}{a - b\cos\theta} = \int_0^{2\pi} \frac{e^{i\theta}\,d\theta}{2ae^{i\theta} - b(e^{2i\theta} + 1)}. \qquad (2)$$

Put

$$e^{i\theta} = z. \qquad (3)$$

As $\theta$ increases from 0 to $2\pi$, $z$ moves round the circle $|z| = 1$. Then

$$I = -\frac{1}{i}\int_C \frac{dz}{bz^2 - 2az + b}, \qquad (4)$$

where the path of integration is around the unit circle. But this is a closed contour and the integral is therefore equal to $2\pi i$ times the sum of the residues at any poles within it. There are two poles, namely, the zeros of the denominator, and their product is 1; write

$$b\alpha = a - \sqrt{(a^2 - b^2)}, \qquad b/\alpha = a + \sqrt{(a^2 - b^2)}. \qquad (5)$$

Then $\alpha$ is within the unit circle and $1/\alpha$ outside. Then

$$I = -\frac{1}{ib}\int_C \frac{dz}{(z - \alpha)(z - 1/\alpha)}. \qquad (6)$$

Near $\alpha$ the integrand has the form

$$\frac{1}{\alpha - 1/\alpha}\Bigl(\frac{1}{z - \alpha} + \text{terms analytic at } z = \alpha\Bigr), \qquad (7)$$

and the residue is therefore $(\alpha - 1/\alpha)^{-1}$. Hence

$$I = -\frac{2\pi i}{ib(\alpha - 1/\alpha)} = \frac{\pi}{\sqrt{(a^2 - b^2)}}. \qquad (8)$$
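
The result (8) can be compared with a direct numerical quadrature of (1). The sketch below is not part of the original text; the function name `integral_0_pi` and the values $a = 3$, $b = 2$ are illustrative assumptions.

```python
import math

def integral_0_pi(a, b, n=2000):
    """Midpoint-rule value of the integral of 1/(a - b cos t) over 0..pi (a > b > 0)."""
    h = math.pi / n
    return sum(1.0 / (a - b * math.cos((k + 0.5) * h)) for k in range(n)) * h

a, b = 3.0, 2.0
alpha = (a - math.sqrt(a * a - b * b)) / b          # the pole inside the unit circle, from (5)
by_residue = (-2j * math.pi / (1j * b * (alpha - 1 / alpha))).real
closed_form = math.pi / math.sqrt(a * a - b * b)
print(integral_0_pi(a, b), by_residue, closed_form)
```

All three numbers should agree; the midpoint rule converges very rapidly here because the integrand is periodic and analytic on the real axis.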

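12·011. Now consider the rather more complicated integral

$$I = \int_0^{\pi} \frac{d\theta}{(a - b\cos\theta)^2} \qquad (1)$$

under the same conditions. With the same transformation

$$I = \frac{2}{i}\int_C \frac{z\,dz}{(bz^2 - 2az + b)^2} \qquad (2)$$

$$= \frac{2}{ib^2}\int_C \frac{z\,dz}{(z - \alpha)^2 (z - 1/\alpha)^2} \qquad (3)$$

again around the unit circle. But now $z = \alpha$ is a double pole and we must expand to get the coefficient of $(z - \alpha)^{-1}$. Put

$$z - \alpha = z', \qquad (4)$$

$$\frac{z}{(z - \alpha)^2 (z - 1/\alpha)^2} = \frac{\alpha + z'}{z'^2 (\alpha - 1/\alpha + z')^2} = \frac{\alpha}{z'^2 (\alpha - 1/\alpha)^2}\Bigl\{1 + z'\Bigl(\frac{1}{\alpha} - \frac{2}{\alpha - 1/\alpha}\Bigr) + \ldots\Bigr\} \qquad (5)$$

and the coefficient of $1/z'$ is

$$\frac{\alpha}{(\alpha - 1/\alpha)^2}\Bigl(\frac{1}{\alpha} - \frac{2}{\alpha - 1/\alpha}\Bigr) = -\frac{\alpha + 1/\alpha}{(\alpha - 1/\alpha)^3}. \qquad (6)$$

This is the coefficient of $z'^{-1}$ when the integrand is developed in powers of $z'$, and therefore is the residue. Hence

$$I = \frac{4\pi}{b^2}\Bigl\{-\frac{\alpha + 1/\alpha}{(\alpha - 1/\alpha)^3}\Bigr\} = \frac{4\pi}{b^2}\Bigl(\frac{2(a^2 - b^2)^{1/2}}{b}\Bigr)^{-3}\frac{2a}{b} = \frac{\pi a}{(a^2 - b^2)^{3/2}}. \qquad (7)$$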
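
The same kind of check applies to the double pole. In this sketch the residue (6) is evaluated from $\alpha$ and compared both with the closed form (7) and with a direct quadrature; the function name `integral_squared` and the values of $a$ and $b$ are illustrative assumptions.

```python
import math

def integral_squared(a, b, n=2000):
    """Midpoint-rule value of the integral of 1/(a - b cos t)^2 over 0..pi (a > b > 0)."""
    h = math.pi / n
    return sum((a - b * math.cos((k + 0.5) * h)) ** -2 for k in range(n)) * h

a, b = 3.0, 2.0
alpha = (a - math.sqrt(a * a - b * b)) / b                    # pole inside the unit circle
residue = -(alpha + 1 / alpha) / (alpha - 1 / alpha) ** 3     # coefficient of 1/z'
by_residue = 4 * math.pi / b ** 2 * residue                   # (2/(i b^2)) * 2 pi i * residue
closed_form = math.pi * a / (a * a - b * b) ** 1.5
print(integral_squared(a, b), by_residue, closed_form)
```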


12·02. The case when $b > a$ may be used to illustrate the notion of the principal value of an integral. Referring to 12·01 (5) we see that in this case the poles $\{a \pm i\sqrt{(b^2 - a^2)}\}/b$ are complex and have modulus 1; the integrand therefore is unbounded on the suggested contour, just as it is near $\cos\theta = a/b$ in the original integral. In such cases the integral is strictly meaningless, but a related integral sometimes occurs in practice, though its use always needs special justification. For real variables, if $f(x)$ has a simple pole at $x = a$ we cut out a range on the path from $a - h$ to $a + h$ and form the integral over the remainder. If this has a limit when $h$ tends to 0 we call the limit the principal value of the integral. The simplest case is that of the logarithm. If $a$ and $b$ are real and have the same sign, we can choose a real path from $a$ to $b$ without passing through the origin, and then

$$\int_a^b \frac{dx}{x} = \log\frac{b}{a}.$$

But if $a$ is negative and $b$ positive the integral diverges at $x = 0$. But we can still define

$$P\!\int_a^b \frac{dx}{x} = \lim_{h\to 0}\Bigl[\int_a^{-h}\frac{dx}{x} + \int_h^b \frac{dx}{x}\Bigr] = \lim_{h\to 0}\Bigl(\log\frac{h}{|a|} + \log\frac{b}{h}\Bigr) = \log b - \log|a|.$$

This integral is called the principal value and is unambiguous. The condition that we must use the same $h$ on both sides is essential; it will be seen that by taking the first integral from $a$ to $-h$ and the second from $k$ to $b$, we could make the limit anything we like by making $h$ and $k$ tend to zero in a suitable ratio. Principal values are often written as ordinary integrals, but this practice should be avoided. We see that if the complex variable was used we could complete the path by a semicircle from $-h$ to $+h$ about the origin, either above or below the real axis. The former, being described in the negative direction, would give a contribution $-\pi i$; the latter, $+\pi i$. According to the path permitted by any cuts made in the complex plane we should therefore have in this case

$$\int_a^b \frac{dz}{z} = \log b - \log|a| \mp \pi i.$$

The principal value is the mean of these alternatives.

Similarly, if a path in the complex plane passes through a simple pole $a$ we can define a principal value of the integral along the path by cutting out the part of the path within a small circle of radius $h$ about $a$ and then making $h$ tend to 0. If we change the variable $z$ to $\zeta$, and $dz/d\zeta$ is finite and not zero at the pole, the same device will define an integral in the $\zeta$ plane, and the two will be equal. For if the circle in the $z$ plane cuts the path at $a - k$ and $a + k'$, where $|k| = |k'| = h$, and that in the $\zeta$ plane cuts the path at $\alpha - \kappa$ and $\alpha + \kappa'$, then if $k$ and $k'$ tend to 0 so that $k/k' \to 1$, $\kappa$ and $\kappa'$ will also tend to 0 so that $\kappa/\kappa' \to 1$.

In the case of the integral 12·01 (1), if $b > a$,

$$P\!\int_0^{\pi} \frac{d\theta}{a - b\cos\theta} = -\frac{1}{i}\,P\!\int_C \frac{dz}{b(z - \alpha)(z - 1/\alpha)},$$

where we must cut out parts of the path within small circles about the two poles and then make the radii tend to 0. But we can still complete the contour by adding small semicircles about the poles and inside the unit circle. There is no singularity within this contour and the integral about it is 0. The two arcs tend to semicircles, and as they are described in the negative sense the integral on each tends to $-\pi i$ times the residue. But the residues are, at $\alpha$ and $1/\alpha$ respectively,

$$-\frac{1}{ib}\,\frac{1}{\alpha - 1/\alpha}, \qquad -\frac{1}{ib}\,\frac{1}{1/\alpha - \alpha},$$

which are equal and opposite. Hence the integrals around the arcs together tend to 0, and therefore the principal value of the integral around the unit circle is 0.

This device for defining a principal value succeeds only at simple poles.
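
That the principal value vanishes when $b > a$ can be confirmed numerically. The sketch below pairs points at equal distances on either side of the pole $\theta_0 = \cos^{-1}(a/b)$, which is the symmetric excision of the definition; the paired integrand is bounded, so an ordinary midpoint rule applies. The function name `principal_value` and the values $a = 1$, $b = 2$ are illustrative assumptions.

```python
import math

def principal_value(a, b, n=20000):
    """Principal value of the integral of 1/(a - b cos t) over 0..pi, for b > a > 0."""
    g = lambda t: 1.0 / (a - b * math.cos(t))
    t0 = math.acos(a / b)            # simple pole; 0 < t0 < pi/2 because a/b > 0
    h = t0 / n
    # Symmetric window [0, 2*t0]: add the values at t0 + u and t0 - u before integrating.
    paired = sum(g(t0 + (k + 0.5) * h) + g(t0 - (k + 0.5) * h) for k in range(n)) * h
    # Remainder [2*t0, pi], where the integrand is regular.
    hr = (math.pi - 2 * t0) / n
    rest = sum(g(2 * t0 + (k + 0.5) * hr) for k in range(n)) * hr
    return paired + rest

pv = principal_value(1.0, 2.0)
print(pv)
```

The printed value should be zero to within the accuracy of the quadrature.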


12·03. If $f(z) = g(z)/h(z)$, where $g(z)$ is analytic and not zero at $z = a$, while $h(z)$ has a simple zero there, we can write

$$f(z) = \frac{g(a)}{(z - a)\,h'(a)} + \phi(z),$$

where $\phi(z)$ is analytic at $a$; then the residue is obtained immediately as $g(a)/h'(a)$. For multiple poles it is usually necessary to carry out the expansion as far as the term in $(z - a)^{-1}$, and this may be troublesome.
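
The rule $g(a)/h'(a)$ can be compared with a residue found by integrating round a small circle. The sketch takes $g(z) = e^z$, $h(z) = \sin z$ as an assumed example, with simple poles at $z = 0$ and $z = \pi$; the function name `residue_by_contour` is likewise an assumption.

```python
import cmath
import math

def residue_by_contour(f, a, radius=0.5, n=1024):
    """(1/(2 pi i)) * integral of f round the circle |z - a| = radius."""
    total = 0j
    for k in range(n):
        w = radius * cmath.exp(2j * math.pi * k / n)
        total += f(a + w) * w        # dz = i w d(theta); i/(2 pi i) = 1/(2 pi)
    return total / n

f = lambda z: cmath.exp(z) / cmath.sin(z)

res0 = residue_by_contour(f, 0.0)
res_pi = residue_by_contour(f, math.pi)
print(res0, 1 / math.cos(0.0))                        # g(0)/h'(0) = 1
print(res_pi, math.exp(math.pi) / math.cos(math.pi))  # g(pi)/h'(pi) = -e^pi
```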
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12*04. Let f(z) be analytic in a region except for poles, and consider the function 
f'{z)/f(z). If a is a pole of order m, 

f(z) = A(z- a)~ m {1 + $(z)}, (1) 

where <j>(z) is analytic near a and is zero at a\ and 

/'(z) = — mA{z— a) -m-1 {l +<f>(z)} + A{z — a)~ m <l>'(z). (2) 


Hence near $a$

$$\frac{f'(z)}{f(z)} = -\frac{m}{z-a} + \psi(z), \tag{3}$$

where $\psi(z)$ is analytic at $a$. If $b$ is a zero of order $n$, we can write

$$f(z) = B(z-b)^{n}\{1+\chi(z)\}, \tag{4}$$

and get similarly

$$\frac{f'(z)}{f(z)} = \frac{n}{z-b} + \omega(z). \tag{5}$$

Hence the function $f'(z)/f(z)$ is analytic in the region except for simple poles at the poles
and zeros of $f(z)$; and the residue at a pole of order $m$ is $-m$, and that at a zero of order
$n$ is $n$. If then we take

$$\frac{1}{2\pi i}\int_C \frac{f'(z)}{f(z)}\,dz \tag{6}$$

around any contour $C$ in the region not passing through a pole or a zero of $f(z)$, its value is
$\Sigma n - \Sigma m$, the excess of the number of zeros of $f(z)$ within $C$ over the number of poles,
multiple poles and zeros being counted multiply. We notice that (6) can also be written
$\frac{1}{2\pi i}[\log f(z)]_C$, the brackets indicating the change of $\log f(z)$ when $z$ completes the circuit $C$,
or $\frac{1}{2\pi}[\arg f(z)]_C$.

Similarly if $h(z)$ is analytic in the region,

$$\frac{1}{2\pi i}\int_C h(z)\,\frac{f'(z)}{f(z)}\,dz = \Sigma\, n\,h(b) - \Sigma\, m\,h(a). \tag{7}$$

In particular if $h(z) = z$ the integral is the excess of the sum of the values of $z$ at the zeros
within $C$ over the sum at the poles.
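The counting integral (6) and its weighted form (7) can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `contour_count`, the test function with a double zero at 0.5 and a simple pole at -0.3, and the use of the trapezoidal rule on a circle are all illustrative assumptions.

```python
import cmath
import math

def contour_count(f, df, h=lambda z: 1.0, centre=0j, radius=1.0, steps=4000):
    """Approximate (1/(2 pi i)) * integral of h(z) f'(z)/f(z) dz round a circle.

    Trapezoidal rule in the angle; it assumes that no zero or pole of f
    lies on the circle itself.
    """
    total = 0j
    for k in range(steps):
        w = cmath.exp(2j * math.pi * k / steps)
        z = centre + radius * w
        dz = 1j * radius * w * (2.0 * math.pi / steps)   # z'(theta) d(theta)
        total += h(z) * df(z) / f(z) * dz
    return total / (2j * math.pi)

# Hypothetical example: double zero at 0.5, simple pole at -0.3.
def f(z):
    return (z - 0.5) ** 2 / (z + 0.3)

def df(z):
    return 2 * (z - 0.5) / (z + 0.3) - (z - 0.5) ** 2 / (z + 0.3) ** 2

print(contour_count(f, df))                     # zeros minus poles inside |z| = 1
print(contour_count(f, df, h=lambda z: z))      # sum of zeros minus sum of poles
```

With the unit circle as contour the first value should be close to 2 - 1 = 1 and the second close to 2(0.5) - (-0.3) = 1.3, each zero and pole being counted with its multiplicity.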


12·041. If $f(z) = g(z) + h(z)$, where $f(z)$ and $g(z)$ are analytic on $C$, and if at all points on $C$
$|h(z)| < |g(z)|$, then

$$[\arg f(z)]_C = [\arg g(z)]_C. \tag{1}$$

Put

$$f(z) = (1+k)\,g(z), \tag{2}$$

where $|k| < 1$ for all $z$ on $C$. Then

$$\arg f(z) - \arg g(z) = \arg(1+k). \tag{3}$$

But $1+k$ has a positive real part; hence $-\tfrac12\pi < \arg(1+k) < \tfrac12\pi$.

Now the changes of $\arg f(z)$ and $\arg g(z)$ on describing $C$ are integral multiples of $2\pi$.
But since $|\arg f(z) - \arg g(z)|$ is always less than $\tfrac12\pi$ the difference of their changes is less
than $\pi$ and therefore must be zero. This is Rouché's theorem.

12·042. This theorem provides a proof of the theorem used in algebra, but incapable
of being proved by purely algebraic methods, that an algebraic equation of degree $n$ has
$n$ roots. Let the equation be

$$f(z) = a_0 z^n + a_1 z^{n-1} + \dots + a_n,$$

where $a_0 \neq 0$. We can find positive quantities $r_1, r_2, \dots$ so that

$$|a_0|\,r_1 > |a_1|, \quad |a_0|\,r_2^2 > |a_2|, \quad \dots, \quad |a_0|\,r_n^n > |a_n|.$$

Take $r = r_1 + r_2 + \dots + r_n$.

Then at all points of the circle $|z| = r$,

$$|a_0 z^n| > |a_1 z^{n-1} + a_2 z^{n-2} + \dots + a_n|.$$

Hence, by 12·041, if we proceed around this circle the changes of argument of $f(z)$ and
$a_0 z^n$ are equal. But $\arg z^n$ increases by $2n\pi$; hence $\arg f(z)$ increases by $2n\pi$. But $f(z)$ has no
poles; hence it has $n$ zeros within the circle. It clearly has none outside.
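The argument of 12·042 can be followed numerically: choose the $r_k$, form $r$, and count the turns of $f(z)$ about the origin on $|z| = r$. This is only a sketch under stated assumptions: the quartic is an arbitrary example, the names `rouche_radius`, `poly` and `winding_number` are hypothetical, the inequality $|a_0| r_k^k > |a_k|$ is read from the reconstructed display above, and the safety factor 1.01 is an arbitrary way of making each inequality strict.

```python
import cmath
import math

def rouche_radius(coeffs):
    """r = r_1 + ... + r_n with |a_0| r_k**k > |a_k|  (coeffs = [a_0, ..., a_n])."""
    a0 = abs(coeffs[0])
    return sum(1.01 * (abs(ak) / a0) ** (1.0 / k) + 1e-12
               for k, ak in enumerate(coeffs) if k >= 1)

def poly(coeffs, z):
    """Horner evaluation of a_0 z**n + a_1 z**(n-1) + ... + a_n."""
    value = 0j
    for a in coeffs:
        value = value * z + a
    return value

def winding_number(coeffs, r, steps=20000):
    """Change of arg f(z) round |z| = r, divided by 2 pi."""
    total = 0.0
    prev = poly(coeffs, r)
    for k in range(1, steps + 1):
        cur = poly(coeffs, r * cmath.exp(2j * math.pi * k / steps))
        total += cmath.phase(cur / prev)      # small increment of the argument
        prev = cur
    return total / (2.0 * math.pi)

coeffs = [1, -2, 0, 5, 3]                     # z^4 - 2 z^3 + 5 z + 3 (illustrative)
r = rouche_radius(coeffs)
print(r, round(winding_number(coeffs, r)))    # radius, and number of zeros inside it
```

The count returned should equal the degree of the polynomial, whatever the coefficients, provided the leading coefficient is not zero.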

12·043. Beginners sometimes argue that since

$$\sin z = z - \frac{z^3}{3!} + \frac{z^5}{5!} - \dots = 0$$

is an equation of infinite degree, it must have an infinite number of roots. The result is
correct, but the argument would apply equally to

$$\exp z = 1 + z + \frac{z^2}{2!} + \dots = 0,$$

which has no roots at all. The reason why the method of 12·042 breaks down for equations
of infinite degree is that there is no term of highest degree to make a starting point, nor
are there any $r$ and $n$ such that the $n$th term has a greater modulus than all others for all
$|z| > r$, so that there is no comparison function satisfying the conditions of 12·041. If
we apply 12·04 directly to $\exp z$ we have

$$\frac{1}{2\pi i}\int_C \frac{f'(z)}{f(z)}\,dz = \frac{1}{2\pi i}\int_C dz = 0$$

whatever path we choose; and therefore $\exp z$ has no zeros and no poles.

12·05. Inverse functions. Suppose that we have an equation

$$\zeta = f(z) \tag{1}$$

giving $\zeta$ as an analytic function of $z$ within a certain region of $z$. Then conversely we may
regard $z$ as a function of $\zeta$, say $z = p(\zeta)$, at a certain set of points, namely the values of
$f(z)$ taken for values of $z$ in the region of $z$. The question is whether $p(\zeta)$ is an analytic
function of $\zeta$, and in the first place whether it is single-valued over a region of $\zeta$. If
$f(z_1) = f(z_2)$, and we take a path from $z_1$ to $z_2$ within the region, $\zeta$ describes a contour, since
by hypothesis $f(z)$ is analytic and therefore bounded on the path. Then when $\zeta$ describes
this contour $z$ does not return to its original value; hence $p(\zeta)$ has a branch point within
the contour, by 11·11 (a). For instance, $f(z) = z^2$ takes the same value for $z = a$ and $-a$.
If $z = ae^{i\theta}$ with $a$ constant and $\theta$ increasing from 0 to $\pi$, $\arg\zeta$ increases by $2\pi$ but $z$ does
not return to its original value. Hence, since $a$ is arbitrarily small, $\zeta = 0$ is a branch point
of $p(\zeta) = \zeta^{1/2}$. Again, if $f(z) = e^z$, and if $y$ varies from $-\pi$ to $\pi$, $x$ remaining equal to a
constant $a$, $|\zeta|$ remains constant at $e^a$ and $\arg\zeta$ increases by $2\pi$, so that $\zeta$ describes the
circle $|\zeta| = e^a$. Since $a$ may be indefinitely large and negative, every circle about $\zeta = 0$
contains a branch point of the inverse function $\log\zeta$, and therefore $\log\zeta$ has a branch point
at $\zeta = 0$. Hence if $p(\zeta)$ is to be single-valued we must be prepared to make such cuts in
the $z$ plane that $f(z)$ never takes the same value twice in any portion. Functions that
take no value more than once in a region are called simple or schlicht in the region.

Even if this is done it is not obvious that the set of values of $\zeta$ determined by (1) constitute
a region; this requires a theorem, which we shall proceed to prove, that in certain conditions
every value of $\zeta$ over a neighbourhood of a value of $f(z)$ is itself a value of $f(z)$.


12·051. If $f(z)$ is analytic at $z = 0$, and if $f(0) = 0$, $f'(0) \neq 0$, there is a region of the $\zeta$ plane
about $\zeta = 0$ such that the equation $f(z) = \zeta$ has one and only one solution $p(\zeta)$ that is analytic
in $\zeta$ and tends to 0 when $\zeta$ tends to 0. If $f'(0) = 0$, $\zeta = 0$ is a branch point of $p(\zeta)$.

Since $f(z)$ is analytic at $z = 0$ (and therefore in a neighbourhood of $z = 0$ by definition
(11·11)) there is an $R$ such that

$$f(z) = a_1 z + a_2 z^2 + \dots \quad (|z| < R). \tag{1}$$

Take $a_1 \neq 0$. Then $a_2 + a_3 z + \dots$ is bounded in any circle $|z| \le c < R$. Put

$$f(z) = a_1 z + h(z). \tag{2}$$

Then, for some $M$, $|h(z)| < M|z^2|$; take $d$ so that $0 < Md < \tfrac12|a_1|$. Then for $0 < |z| \le d$

$$|f(z) - a_1 z| < M|z^2| \le |z|\,Md < \tfrac12|a_1 z|. \tag{3}$$

For given $\zeta$ the number of roots of $f(z) = \zeta$ within the circle $C$ ($|z| = d$) is

$$\frac{1}{2\pi i}\int_C \frac{f'(z)}{f(z)-\zeta}\,dz = \frac{1}{2\pi}\bigl[\arg\{a_1 z + h(z) - \zeta\}\bigr]_C. \tag{4}$$

Let $|\zeta| < \tfrac12|a_1 d|$. Since $|h(z)| < \tfrac12|a_1 d|$ at all points of $C$, $|a_1 z| > |h(z) - \zeta|$ on $C$, and
the number of roots of $f(z) = \zeta$ within $C$ is the same as those of $a_1 z = 0$, that is, 1. If $\zeta = 0$
the root is $z = 0$. Hence $z$ is a single-valued function of $\zeta$, determinate for all $\zeta$ satisfying
$|\zeta| < \tfrac12|a_1 d|$.

Now consider the integral

$$\int_C \frac{t f'(t)}{f(t)-\zeta}\,dt. \tag{5}$$

The only pole of the integrand is where $f(t) = \zeta$, and the residue is the value of $t$ at this point,
that is, $z$. Hence the integral is $2\pi i z$. The integral is an analytic function of $\zeta$; hence for
$|\zeta| < \tfrac12|a_1 d|$, $z$ is an analytic function of $\zeta$, equal to 0 when $\zeta = 0$.

If $a_1 = 0$, let $a_n$ ($n > 1$) be the first non-zero coefficient in (1). Applying the necessary
modifications to the argument we find that there is a $\delta$ such that for any $\zeta$ satisfying
$0 < |\zeta| < \delta$, there will be $n$ values of $z$, tending to 0 with $\zeta$, that make $f(z) = \zeta$, and therefore
$z$ is not a single-valued function of $\zeta$ in any neighbourhood of $\zeta = 0$. This completes the
proof of the theorem.

If $f(z) = a_0$ for $z = z_0$, corresponding results hold, as we see by considering $f(z) - a_0$ and
$z - z_0$.
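The integral (5), divided by $2\pi i$, can be used directly to compute the inverse function. The sketch below is an illustration only: the function $f(z) = z + z^2$, the radius $d = 0.25$, the value of $\zeta$ and the name `inverse_by_contour` are assumptions chosen so that the conditions of the proof hold ($a_1 = 1$, $h(z) = z^2$, and any $M$ slightly greater than 1, say $M = 1.5$, gives $Md < \tfrac12|a_1|$).

```python
import cmath
import math

def inverse_by_contour(f, df, zeta, d, steps=2000):
    """(1/(2 pi i)) * integral of t f'(t) / (f(t) - zeta) dt round |t| = d."""
    total = 0j
    for k in range(steps):
        t = d * cmath.exp(2j * math.pi * k / steps)
        dt = 1j * t * (2.0 * math.pi / steps)      # t'(theta) d(theta)
        total += t * df(t) / (f(t) - zeta) * dt
    return total / (2j * math.pi)

f = lambda z: z + z * z        # a_1 = 1, h(z) = z**2
df = lambda z: 1 + 2 * z
d = 0.25                       # with M = 1.5, M d = 0.375 < |a_1| / 2
zeta = 0.1 + 0.05j             # |zeta| < |a_1 d| / 2 = 0.125

z = inverse_by_contour(f, df, zeta, d)
print(z, abs(f(z) - zeta))     # the root inside C and the residual of f(z) = zeta
```

The residual should be at rounding level, and putting `zeta = 0` should return a value indistinguishable from 0, in agreement with the statement that the solution vanishes with $\zeta$.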

12·052. If $f'(0) \neq 0$, $z$ is expressed as an analytic function $p(\zeta)$ over a certain neighbourhood,
in which $f\{p(\zeta)\} = \zeta$. If $p(\zeta)$ can be extended by continuation, this relation will
hold over the whole region attainable by continuation, and therefore the continuation
remains the inverse function. It is desirable to say something about the positions of its
singularities, but a complete account is not possible.

We have seen that if $f'(z_0) = 0$, $f(z_0)$ is a branch point of $p(\zeta)$, as for $f(z) = z^2$ near
$z = 0$.

Also a value $\zeta_0$ not taken by $f(z)$ obviously cannot be part of a region where $p(\zeta)$ is
analytic. For instance, if $f(z) = e^z$, the excluded value zero is a branch point of $\log\zeta$.

In any neighbourhood of a pole or an isolated essential singularity of $f(z)$, $|f(z)|$ is
unbounded. Then we may consider

$$\zeta' = 1/f(z).$$

If $f(z)$ has a simple pole at $z_0$, $d\zeta'/dz$ is not zero; hence in a neighbourhood of $z_0$, $z$ is an
analytic function of $\zeta'$ and therefore of $1/\zeta$. Then $p(\zeta)$ is analytic, tending to $z_0$, for values
of $\zeta$ outside a sufficiently large circle. But if $f(z)$ has a multiple pole, $p(\zeta)$ outside any
circle will be many-valued. If $f(z)$ has an isolated essential singularity at $z_0$, any value
except possibly one is taken infinitely many times. Hence a function with such a singularity
cannot have a single-valued inverse unless we introduce cuts. If $f(z)$ never takes a
value $\zeta_0$, then $\zeta_0$ is a branch point of $p(\zeta)$. If $f(z)$ takes all values in a neighbourhood of $z_0$
there is at least a branch point at $\zeta = \infty$, since values of $f(z)$ of arbitrarily large modulus
occur infinitely many times.

General rules for the neighbourhood of unisolated essential singularities would be
extremely difficult to state.

If near a branch point of $f(z)$

$$\zeta = f(z) = z^{1/m}(a_0 + a_1 z + \dots),$$

with $m$ a positive integer, we can put $z^{1/m} = Z$. Then $Z$, and therefore $z$, have not branch
points at $\zeta = 0$. In this case, though we need a cut to make $f(z)$ single-valued, nevertheless
if we ignore the cut and allow $z$ to proceed many times around 0, $f(z)$ cannot repeat a value
until $m$ circuits have been completed, and then repeats the initial value only for the initial
value of $z$.

If however

$$f(z) = z^{m/n}(a_0 + a_1 z + \dots)$$

where $m$, $n$ are positive integers $> 1$ and prime to each other, then the inverse function
has a branch point at $\zeta = 0$.

More complicated behaviour of $f(z)$ near a branch point will necessitate special
treatment.

If $z$ tends to infinity, $f(z)$ may or may not tend to a limit, and the limit, if it exists for
different paths, may or may not have the same value for all paths. Then for a corresponding
path for $\zeta$, $p(\zeta)$ tends to infinity, and the limit of $f(z)$ corresponds to a singularity
of $p(\zeta)$. The easiest treatment is to put $z' = 1/z$, $f(z) = g(z')$, and to consider the inversion
of $\zeta = g(z')$ near $\zeta = g(0)$.


12·053. Lagrange's expansion. This provides a formal development of the power
series of expansion of the inverse function, and an extension gives one of an analytic
function of the inverse function. In the integral 12·051 (5) we can replace the circle $C$
by a contour $D$ within the region where $f(z)$ is analytic, and $D$ can be taken as large as we
like subject to $f(z)$ not taking the same value twice within or on it. Then in 12·051 (5) put
$f(t) = \tau$ and take $|\zeta|$ less than the smallest value of $|\tau|$ on $D$. Then if $\Delta$ is the corresponding
path in the $\tau$ plane

$$z = p(\zeta) = \frac{1}{2\pi i}\int_\Delta \frac{t\,d\tau}{\tau-\zeta} = \frac{1}{2\pi i}\int_\Delta \sum_{n=0}^{\infty}\frac{\zeta^n}{\tau^{n+1}}\,t\,d\tau = \sum_{n=0}^{\infty} a_n\zeta^n. \tag{1}$$

Here $a_0 = 0$, and for $n \ge 1$

$$a_n = \frac{1}{2\pi i}\int_\Delta \frac{t\,d\tau}{\tau^{n+1}} = \frac{1}{2\pi i}\int_D \frac{t f'(t)\,dt}{\{f(t)\}^{n+1}} = -\frac{1}{2\pi i}\left[\frac{t}{n\{f(t)\}^n}\right]_D + \frac{1}{2\pi i n}\int_D \frac{dt}{\{f(t)\}^n}. \tag{2}$$

The integrated part vanishes because $f(t)$ is single-valued. Hence $a_n$ is $1/n$ times the residue
of $\{f(t)\}^{-n}$ at $t = 0$. This form is originally due to Jacobi. Specification of the radius of
convergence of the series may be difficult; by 11·141 it is at least equal to the value of
$|f(z)|$ at the singularity of $p(\zeta)$ of smallest modulus, but this rule needs care, partly on
account of the treatment of cuts. If we take curves $C_m$ enclosing $z = 0$ such that

$$|f(z)| = m > 0,$$

such curves are simple for $m$ small enough; and if $m' < m$, $C_m$ will enclose $C_{m'}$. If we increase
$m$, $C_m$ may reach a point $z = b$ where $f'(z) = 0$. Then $C_m$ has a node at $b$, and for larger $m$
will open at $b$ and form a bulge, and values of $f(z)$ near $f(b)$ will be taken twice or more.
Then $|f(b)|$ will be the radius of convergence. But this does not exclude the possibility
that $|f(z)|$ may have smaller values at other stationary points. Consider for instance

$$\zeta = f(z) = z(z-1)^2.$$

This is stationary at $z = \tfrac13$ and $z = 1$. At the latter point $f(z) = 0$, but the radius of
convergence of the solution of $f(z) = \zeta$ that vanishes with $\zeta$ is $|f(\tfrac13)| = \tfrac{4}{27}$. If we try to continue
this solution to the neighbourhood of $z = 1$ we must cross the closed curve about $z = 0$
specified by $|\zeta| = \tfrac{4}{27}$, and the solutions $z = 1 \pm \zeta^{1/2} + \dots$ that make $\zeta$ small near $z = 1$
correspond to different branches of the inverse function.

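The coefficients can be evaluated by putting

$$f(z) = \frac{z}{\phi(z)}, \tag{3}$$

where $\phi(z)$ will not be zero within $D$. Then

$$a_n = \frac{1}{2\pi i n}\int_D \frac{\{\phi(t)\}^n}{t^n}\,dt. \tag{4}$$

But from the formulae for Taylor coefficients in 11·13

$$\frac{1}{2\pi i}\int_D \frac{\{\phi(t)\}^n}{t^n}\,dt = \frac{1}{(n-1)!}\left[\frac{d^{n-1}}{dz^{n-1}}\{\phi(z)\}^n\right]_{z=0}, \tag{5}$$

whence

$$a_n = \frac{1}{n!}\left[\frac{d^{n-1}}{dz^{n-1}}\{\phi(z)\}^n\right]_{z=0} \tag{6}$$

and

$$z = \sum_{n=1}^{\infty}\frac{\zeta^n}{n!}\left[\frac{d^{n-1}}{dz^{n-1}}\{\phi(z)\}^n\right]_{z=0}. \tag{7}$$

This is Lagrange's expansion. We have taken $z = 0$ as starting point. If we have
$f(z) = \alpha$ when $z = a$, and

$$\zeta - \alpha = \frac{z-a}{\phi(z)}, \tag{8}$$

we have the more general form

$$z = a + \sum_{n=1}^{\infty}\frac{(\zeta-\alpha)^n}{n!}\left[\frac{d^{n-1}}{dz^{n-1}}\{\phi(z)\}^n\right]_{z=a}, \tag{9}$$

convergent within a circle in the $\zeta$ plane such that

$$\frac{d}{dz}\,\frac{z-a}{\phi(z)} = \frac{\phi(z)-(z-a)\phi'(z)}{\{\phi(z)\}^2}$$

never vanishes within it.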

It is unusual for the general coefficient in the inverted series to be expressible in a
convenient form.* But if the radius of convergence is known it guarantees the existence of
the expansion within a definite region, and extends to analytic functions the theorem of
inversion of a monotonic function given in 1·066.

If $h(z)$ is another function of $z$ analytic in the region it is a simple extension to find its
expansion in terms of $\zeta$. We have only to put $h(t)$ for $t$ in (1), and

$$h(z) - h(a) = \sum_{n=1}^{\infty} c_n(\zeta-\alpha)^n, \tag{11}$$

where

$$c_n = \frac{1}{n!}\left[\frac{d^{n-1}}{dz^{n-1}}\bigl(h'(z)\{\phi(z)\}^n\bigr)\right]_{z=a}; \tag{12}$$

that is, $c_n$ is $1/n$ times the residue of $\{f(z)-f(a)\}^{-n}h'(z)$ at $z = a$.
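The coefficient rule can be tried on the example $\zeta = z(z-1)^2$ discussed above, for which $\phi(z) = (1-z)^{-2}$ has Taylor coefficients $1, 2, 3, \dots$. The sketch below uses the equivalent statement that $a_n$ is $1/n$ times the coefficient of $z^{n-1}$ in $\{\phi(z)\}^n$; the helper names `series_mul` and `lagrange_coefficients` are hypothetical, and the closed form used for comparison comes from the binomial series for $(1-z)^{-2n}$ rather than from the text.

```python
from fractions import Fraction
from math import comb

def series_mul(a, b, order):
    """Product of two power series (coefficient lists), kept up to z**(order-1)."""
    out = [Fraction(0)] * order
    for i, ai in enumerate(a[:order]):
        if ai:
            for j, bj in enumerate(b[:order - i]):
                out[i + j] += ai * bj
    return out

def lagrange_coefficients(phi, count):
    """a_1..a_count of z = sum a_n zeta**n, where zeta = z / phi(z).

    phi holds the Taylor coefficients of phi(z) about 0, with phi[0] != 0;
    a_n is 1/n times the coefficient of z**(n-1) in phi(z)**n.
    """
    coeffs = []
    power = [Fraction(1)] + [Fraction(0)] * (count - 1)    # phi**0
    for n in range(1, count + 1):
        power = series_mul(power, phi, count)              # now phi**n
        coeffs.append(power[n - 1] / n)
    return coeffs

N = 8
phi = [Fraction(k + 1) for k in range(N)]       # (1 - z)**(-2) = 1 + 2z + 3z^2 + ...
a = lagrange_coefficients(phi, N)
closed_form = [Fraction(comb(3 * n - 2, n - 1), n) for n in range(1, N + 1)]
print(a == closed_form, [str(x) for x in a[:4]])
```

The first terms are $z = \zeta + 2\zeta^2 + 7\zeta^3 + 30\zeta^4 + \dots$, which can be confirmed by substituting back into $z - 2z^2 + z^3$. The growth of the coefficients is consistent with the radius of convergence $4/27$ quoted in the text.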


* They are given in detail up to the coefficient of $z^{12}$ in terms of the coefficients in the series expansion
of $f(z)$ by W. E. Bleieck, Phil. Mag. (7) 33, 1942, 637-8; cf. also W. G. Bickley and J. C. P. Miller,
Phil. Mag. (7) 34, 1943, 35-6.

12·06. Mittag-Leffler's theorem. Suppose that $f(z)$ has an infinite number of
poles and no other singularities. We wish to know whether it is possible to extend the
result of 11·175 and say that $f(z)$ differs from the sum of the principal parts at the poles
by a constant. We cannot assume the extra condition used in 11·175 that $f(z)$ is bounded
for all $|z| > R$, where $R$ is fixed, for $f(z)$ is infinite at some points outside any given
contour. But it may be possible to choose a set of contours $C_1, C_2, \dots$ such that on every
$C_m$, $|f(z)| < M$, where $M$ is independent of $m$, and $|z| > R_m$, where $R_m \to \infty$, and such that

$$\int_{C_m}\left|\frac{dz}{z}\right| < 2\pi A,$$

where $A$ is fixed; that is, we choose the contours to pass between the poles and
ultimately to become indefinitely large. Then if $t = z$ is not a pole of $f(t)$, and $C_m$
encloses $z$,

$$\frac{1}{2\pi i}\int_{C_m}\frac{f(t)}{t-z}\,dt = f(z) - P_m(z), \tag{1}$$

where $P_m(z)$ is the sum of the principal parts at all poles $a$ within $C_m$. But also, if
$f(t)$ is analytic at $t = 0$,

$$\frac{1}{2\pi i}\int_{C_m}\frac{f(t)}{t-z}\,dt = \frac{1}{2\pi i}\int_{C_m}\frac{f(t)}{t}\,dt + \frac{1}{2\pi i}\int_{C_m}\frac{z f(t)}{t(t-z)}\,dt, \tag{2}$$

and the first term on the right is $f(0) - P_m(0)$, by (1). Then

$$f(z) = f(0) + \{P_m(z) - P_m(0)\} + \frac{1}{2\pi i}\int_{C_m}\frac{z f(t)}{t(t-z)}\,dt. \tag{3}$$

As we take $R_m$ larger and larger we include more and more poles and add more and more
terms to the sum, and then the integral gives a remainder term. If it tends to zero the
sum will therefore converge as $R_m \to \infty$ and be equal to $f(z)$. But this is true; for

$$\left|\int_{C_m}\frac{z f(t)}{t(t-z)}\,dt\right| < \frac{M|z|\,2\pi A}{R_m - |z|}. \tag{4}$$

Then

$$f(z) = f(0) + \lim\,\{P_m(z) - P_m(0)\}. \tag{5}$$

If a function analytic at the origin has no singularities other than poles for finite $z$, and if
we can choose a sequence of contours $C_m$ about $z = 0$ tending to infinity, such that $|f(z)|$
never exceeds a given quantity $M$ on any of these contours and $\int |dz/z|$ is uniformly bounded
on them, then (5) holds; where $P_m(z)$ is the sum of the principal parts of $f(z)$ at all poles $a$
within $C_m$. If there is a pole at $z = 0$ we can replace $f(0)$ by the negative powers and the
constant term in the Laurent expansion of $f(z)$ about $z = 0$.

It may not be legitimate to write (5) in the form

$$f(z) = \{f(0) - \lim P_m(0)\} + \lim P_m(z);$$

for (5) may converge without either $\{P_m(0)\}$ or $\{P_m(z)\}$ converging.

12·061. As an example, take

$$f(z) = \operatorname{cosec} z - \frac{1}{z}. \tag{1}$$

This is analytic at $z = 0$, since we can define

$$f(0) = \lim_{z\to 0}\frac{z - \sin z}{z\sin z} = 0$$

and then

$$f'(0) = \lim_{z\to 0}\frac{f(z) - f(0)}{z} = \lim_{z\to 0}\frac{z - \sin z}{z^2\sin z} = \frac16.$$

It has simple poles at $z = \pm n\pi$ for all integral $n$, and

$$\operatorname{cosec}(n\pi + z') - \frac{1}{n\pi + z'} = \frac{(-1)^n}{z'} + \text{analytic terms};$$

thus the residue at $n\pi$ is $(-1)^n$.

Now

$$\sin(x + iy) = \sin x\cosh y + i\cos x\sinh y,$$

$$|\sin(x+iy)|^2 = \sin^2 x\cosh^2 y + \cos^2 x\sinh^2 y = \sinh^2 y + \sin^2 x.$$

It is easiest to see that $f(z)$ is bounded on a suitable series of contours by taking squares
$C_m$ with their sides $x = \pm(m+\tfrac12)\pi$, $y = \pm(m+\tfrac12)\pi$, where $m$ is an integer. $\int |dz/z|$ around
$C_m$ is $8\sinh^{-1} 1$. On the sides parallel to the $y$ axis

$$|\sin z| \ge |\sin(m+\tfrac12)\pi| = 1,$$

and on those parallel to the $x$ axis

$$|\sin z| \ge \sinh(m+\tfrac12)\pi \ge (m+\tfrac12)\pi.$$

Hence $f(z)$ is bounded on $C_m$ for all $m$ and the conditions for Mittag-Leffler's theorem are
satisfied. Hence

$$\operatorname{cosec} z - \frac{1}{z} = \sum (-1)^n\left(\frac{1}{z - n\pi} + \frac{1}{n\pi}\right), \tag{2}$$

the summation being over all positive and negative integers $n$, not including 0. If we
combine equal and opposite values of $n$,

$$\operatorname{cosec} z = \frac{1}{z} - \sum_{n=1}^{\infty}(-1)^n\frac{2z}{n^2\pi^2 - z^2}. \tag{3}$$
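The partial-fraction series for $\operatorname{cosec} z$ obtained by combining the terms $\pm n$ can be compared with a direct evaluation. This is a small sketch with a hypothetical function name `cosec_series` and an arbitrary truncation at 20000 terms; the series is read as $1/z - \sum_{n\ge1}(-1)^n\,2z/(n^2\pi^2 - z^2)$.

```python
import cmath
import math

def cosec_series(z, terms=20000):
    """Partial sum of 1/z - sum_{n>=1} (-1)**n * 2 z / (n**2 pi**2 - z**2)."""
    total = 1.0 / z
    for n in range(1, terms + 1):
        total -= (-1) ** n * 2.0 * z / (n * n * math.pi ** 2 - z * z)
    return total

for z in (1.0, 2.5, 0.3 + 0.4j):
    exact = 1.0 / cmath.sin(z)
    print(z, abs(cosec_series(z) - exact))      # truncation error of the partial sum
```

Because the terms alternate in sign, the truncation error is of the order of the first term omitted, so the partial sum should agree with $1/\sin z$ to many figures at any point that is not a pole.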

12·062. Again, take

$$f(z) = \cot z - \frac{1}{z}. \tag{4}$$

The sum given by the formula is, since all residues are 1,

$$-\sum_{n=1}^{\infty}\frac{2z}{n^2\pi^2 - z^2}. \tag{5}$$

To justify it we take the same series of squares $C_m$. We have

$$\tan z = \frac{\tan x + i\tanh y}{1 - i\tan x\tanh y},$$

$$|\cot z|^2 = \frac{1 + \tan^2 x\tanh^2 y}{\tan^2 x + \tanh^2 y} = \tanh^2 y + \frac{1 - \tanh^4 y}{\tan^2 x + \tanh^2 y}.$$

When $x = \pm(m+\tfrac12)\pi$ this is equal to $\tanh^2 y$ and less than 1. When $y$ is large, $\tanh^2 y$ is
nearly 1, and $|\cot z|^2$ is nearly 1 for all $x$. Hence the conditions are satisfied and (5) is true.

12·063. By transformation or by similar arguments (not necessarily with the same
choice of squares) we can obtain the following:


$$\tan z = 8z\sum_{n=1}^{\infty}\frac{1}{(2n-1)^2\pi^2 - 4z^2}, \tag{6}$$

$$\sec z = 4\pi\sum_{n=1}^{\infty}\frac{(-1)^{n-1}(2n-1)}{(2n-1)^2\pi^2 - 4z^2}, \tag{7}$$

$$\coth z = \frac{1}{z} + 2z\sum_{n=1}^{\infty}\frac{1}{n^2\pi^2 + z^2}, \tag{8}$$

$$\operatorname{cosech} z = \frac{1}{z} + 2z\sum_{n=1}^{\infty}\frac{(-1)^n}{n^2\pi^2 + z^2}, \tag{9}$$

$$\operatorname{sech} z = 4\pi\sum_{n=1}^{\infty}\frac{(-1)^{n-1}(2n-1)}{(2n-1)^2\pi^2 + 4z^2}, \tag{10}$$

$$\tanh z = 8z\sum_{n=1}^{\infty}\frac{1}{(2n-1)^2\pi^2 + 4z^2}. \tag{11}$$

These series are all uniformly convergent in any bounded closed region excluding all
poles. Thus if $r$ is the largest value of $|z|$ in a range and we take $n$ in the cotangent series
greater than $r/\pi$, the $n$th and succeeding terms can be written

$$-2z\left(\frac{1}{n^2\pi^2 - z^2} + \frac{1}{(n+1)^2\pi^2 - z^2} + \dots\right)$$

and the moduli are less than the terms of the absolutely convergent series

$$2r\left(\frac{1}{n^2\pi^2 - r^2} + \frac{1}{(n+1)^2\pi^2 - r^2} + \dots\right).$$

For the series for $\sec z$ and $\operatorname{sech} z$, which are not absolutely convergent, we take the terms
in pairs and apply the extension of the $M$ test in 1·1152. Hence we can integrate all these
series term by term. Then

$$\int_0^z\left(\cot z - \frac{1}{z}\right)dz = \left[\log\frac{\sin z}{z}\right]_0^z = \sum_{n=1}^{\infty}\log\left(1 - \frac{z^2}{n^2\pi^2}\right),$$

and therefore

$$\sin z = z\prod_{n=1}^{\infty}\left(1 - \frac{z^2}{n^2\pi^2}\right). \tag{12}$$

Similarly

$$[\log\sec z]_0^z = -\sum_{n=1}^{\infty}\log\left(1 - \frac{4z^2}{(2n-1)^2\pi^2}\right),$$

$$\cos z = \prod_{n=1}^{\infty}\left(1 - \frac{4z^2}{(2n-1)^2\pi^2}\right). \tag{13}$$

12·07. The Bernoulli numbers and polynomials. The method of residues gives
a convenient way of connecting these with trigonometrical functions. If we write 9·08 (6) as


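$$\frac{1}{e^{\alpha}-1} - \frac{1}{\alpha} + \frac12 = \sum_{r=2}^{\infty} b_r\alpha^{r-1} \tag{1}$$

and regard $\alpha$ as a complex variable, the function on the left has simple poles of residue 1
at all the points $\alpha = \pm 2n\pi i$, 0 excepted, and is bounded on circles of radius $(2n+1)\pi$
centred on the origin. It vanishes for $\alpha = 0$. Hence it is equal to

$$\sum_{n=-\infty}^{\infty}{}'\left(\frac{1}{\alpha - 2n\pi i} + \frac{1}{2n\pi i}\right) = \sum_{n=1}^{\infty}\frac{2\alpha}{4n^2\pi^2 + \alpha^2} \tag{2}$$

$$= \sum_{n=1}^{\infty}\sum_{r=1}^{\infty}\frac{2\alpha}{4n^2\pi^2}\left(\frac{-\alpha^2}{4n^2\pi^2}\right)^{r-1} \quad (|\alpha| < 2\pi), \tag{3}$$

the accent meaning that $n = 0$ is excluded from the sum. Comparing with (1) we have

$$b_{2r} = (-1)^{r-1}\sum_{n=1}^{\infty}\frac{2}{(4n^2\pi^2)^r}, \qquad b_{2r+1} = 0. \tag{4}$$

In particular

$$b_2 = \frac{1}{2\pi^2}\left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \dots\right), \tag{5}$$

$$b_4 = -\frac{1}{8\pi^4}\left(1 + \frac{1}{2^4} + \frac{1}{3^4} + \dots\right), \tag{6}$$

and therefore, since $B_r = r!\,b_r$,

$$1 + \frac{1}{2^2} + \frac{1}{3^2} + \dots = 2\pi^2 b_2 = \pi^2 B_2 = \frac{\pi^2}{6}, \tag{7}$$

$$1 + \frac{1}{2^4} + \frac{1}{3^4} + \dots = -8\pi^4 b_4 = -\tfrac13\pi^4 B_4 = \frac{\pi^4}{90}, \tag{8}$$

$$1 + \frac{1}{2^6} + \frac{1}{3^6} + \dots = 32\pi^6 b_6 = \tfrac{2}{45}\pi^6 B_6 = \frac{\pi^6}{945}. \tag{9}$$

On account of the smallness of the second term we have, very nearly, for large $r$,

$$b_{2r} = (-1)^{r-1}\frac{2}{(2\pi)^{2r}}, \quad\text{and}\quad B_{2r} = (-1)^{r-1}\frac{2\,(2r)!}{(2\pi)^{2r}}. \tag{10}$$

Now combining 9·08 (6), (7) we can write

$$\frac{e^{\alpha t}}{e^{\alpha}-1} + \frac12 - \frac{1}{\alpha} = t + \sum_{r=2}^{\infty}\{P_r(t) + b_r\}\alpha^{r-1}. \tag{11}$$

This function is bounded on circles of radius $(2n+1)\pi$ for $0 < t < 1$ and reduces to $t$ at $\alpha = 0$.
Hence it is equal to

$$t + \sum_{n=-\infty}^{\infty}{}' e^{2n\pi i t}\left(\frac{1}{\alpha - 2n\pi i} + \frac{1}{2n\pi i}\right)$$

$$= t + \sum_{n=1}^{\infty}\left(\frac{2\alpha\cos 2n\pi t}{4n^2\pi^2 + \alpha^2} + \frac{\alpha^2\sin 2n\pi t}{n\pi(4n^2\pi^2 + \alpha^2)}\right)$$

$$= t + \sum_{n=1}^{\infty}\sum_{r=1}^{\infty}\left(\frac{2\alpha\cos 2n\pi t}{4n^2\pi^2} + \frac{\alpha^2\sin 2n\pi t}{4n^3\pi^3}\right)\left(\frac{-\alpha^2}{4n^2\pi^2}\right)^{r-1}. \tag{12}$$

Comparing with (11) we have

$$P_{2r}(t) + b_{2r} = (-1)^{r-1}\sum_{n=1}^{\infty}\frac{2\cos 2n\pi t}{(4n^2\pi^2)^r}, \tag{13}$$

$$P_{2r-1}(t) = (-1)^{r}\sum_{n=1}^{\infty}\frac{\sin 2n\pi t}{n\pi(4n^2\pi^2)^{r-1}} \quad (r \ge 2). \tag{14}$$

When $t = 0$ the right of (13) reduces to $b_{2r}$, and $P_{2r}(0) = 0$ as we should expect. In particular

$$P_2(t) + b_2 = \frac{1}{2\pi^2}\left(\cos 2\pi t + \frac{1}{2^2}\cos 4\pi t + \frac{1}{3^2}\cos 6\pi t + \dots\right), \tag{15}$$

$$P_3(t) = \frac{1}{4\pi^3}\left(\sin 2\pi t + \frac{1}{2^3}\sin 4\pi t + \frac{1}{3^3}\sin 6\pi t + \dots\right), \tag{16}$$

$$P_4(t) + b_4 = -\frac{1}{8\pi^4}\left(\cos 2\pi t + \frac{1}{2^4}\cos 4\pi t + \frac{1}{3^4}\cos 6\pi t + \dots\right). \tag{17}$$

As for the numbers $b_r$ these reduce with increasing accuracy to the first terms for large $r$.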
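The relation between the numbers $b_{2r}$ and the sums of inverse even powers can be checked with exact arithmetic. The sketch below assumes that $b_r$ are the coefficients in $\alpha/(e^{\alpha}-1) = \sum b_r\alpha^r$, so that $b_r = B_r/r!$ as stated in the text, and obtains them from the recurrence that follows on multiplying by $(e^{\alpha}-1)/\alpha$; the function names `b_numbers` and `zeta_even` are hypothetical.

```python
import math
from fractions import Fraction

def b_numbers(count):
    """b_0 .. b_{count-1}, where alpha/(exp(alpha) - 1) = sum b_r alpha**r."""
    b = []
    for m in range(count):
        # Coefficient of alpha**m in (sum b_r alpha**r) * (exp(alpha) - 1)/alpha
        # must be 1 for m = 0 and 0 otherwise.
        s = sum(b[r] / math.factorial(m - r + 1) for r in range(m))
        b.append((Fraction(1) if m == 0 else Fraction(0)) - s)
    return b

def zeta_even(r, b):
    """1 + 2**(-2r) + 3**(-2r) + ..., from b_2r = (-1)**(r-1) * 2 * sum / (2 pi)**(2r)."""
    return (-1) ** (r - 1) * float(b[2 * r]) * (2.0 * math.pi) ** (2 * r) / 2.0

b = b_numbers(8)
for r in (1, 2, 3):
    direct = sum(n ** (-2.0 * r) for n in range(1, 100001))
    print(r, b[2 * r], zeta_even(r, b), direct)
```

The exact values $b_2 = 1/12$, $b_4 = -1/720$ and $b_6 = 1/30240$ reproduce $\pi^2/6$, $\pi^4/90$ and $\pi^6/945$, and the odd-indexed $b_3$, $b_5$, $b_7$ come out as zero.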


12·08. Operational methods and contour integration. Complex integrals can
be used to interpret many operational solutions of problems relating to systems with an
infinite number of degrees of freedom. We shall first show their relation to the simple
operators $p^{-n}$ of Chapter 7.

Consider the integral

$$\frac{1}{2\pi i}\int_C \frac{e^{zt}}{z^{n+1}}\,dz, \tag{1}$$

where $C$ is a closed path in the $z$ plane enclosing the origin, $n$ is a positive integer, and $t$ is
independent of $z$. On expanding the exponential function we see that the coefficient of
$z^{-1}$ is $t^n/n!$; hence this is the residue of the integrand at the origin, and there are no other
poles within $C$. Hence the integral is equal to $t^n/n!$.
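The value $t^n/n!$ of the integral (1) is easy to confirm numerically on a circle about the origin. The following is a minimal sketch; the name `operator_integral`, the radius and the number of sample points are arbitrary choices made for the illustration.

```python
import cmath
import math

def operator_integral(n, t, radius=1.0, steps=400):
    """(1/(2 pi i)) * integral of exp(z t) / z**(n+1) dz round |z| = radius."""
    total = 0j
    for k in range(steps):
        z = radius * cmath.exp(2j * math.pi * k / steps)
        dz = 1j * z * (2.0 * math.pi / steps)        # z'(theta) d(theta)
        total += cmath.exp(z * t) / z ** (n + 1) * dz
    return total / (2j * math.pi)

for n, t in ((1, 0.7), (3, 2.5), (5, 1.2)):
    print(n, t, operator_integral(n, t).real, t ** n / math.factorial(n))
```

The result should not depend on the radius, since the only singularity of the integrand is at the origin.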

It follows that

$$\frac{1}{2\pi i}\int_C \frac{e^{zt}}{z}\left(a_0 + \frac{a_1}{z} + \dots + \frac{a_n}{z^n}\right)dz = a_0 + a_1 t + \dots + \frac{a_n t^n}{n!}. \tag{2}$$

Now let

$$F(z) = a_0 + \sum_{n=1}^{\infty}\frac{a_n}{z^n} \tag{3}$$

converge when $|z| = R$; it will be uniformly convergent for all greater values of $|z|$.
Then in the integral

$$f(t) = \frac{1}{2\pi i}\int_C \frac{e^{zt}}{z}F(z)\,dz, \tag{4}$$

where $C$ is now a contour in the region where $|z| > R$, the integrand is a uniformly
convergent series and we can integrate term by term. Hence

$$\frac{1}{2\pi i}\int_C \frac{e^{zt}}{z}F(z)\,dz = a_0 + \sum_{n=1}^{\infty}\frac{a_n t^n}{n!}. \tag{5}$$

This may be compared with the rule for interpreting the operational expression

$$F(p)\,1 = \left(a_0 + \sum_{n=1}^{\infty} a_n p^{-n}\right)1 = a_0 + \sum_{n=1}^{\infty}\frac{a_n t^n}{n!}, \tag{6}$$

which was shown to be valid provided that a positive quantity $r$ exists such that the
series $\sum_{n=0}^{\infty} a_n\zeta^n$ converges for $|\zeta| < r$. Changing $\zeta$ to $1/z$ and $r$ to $1/R$ we see that the
condition for the validity of (5) is that $R$ exists such that $\sum_{n=0}^{\infty} a_n z^{-n}$ converges when $|z| > R$.
But this is precisely the condition assumed in deriving (5), and we can now express it by
saying that $F(z)$ is analytic at infinity. Hence if $F(p)$ satisfies this condition

$$F(p)\,1 = \frac{1}{2\pi i}\int_C \frac{e^{zt}}{z}F(z)\,dz. \tag{7}$$

Hence the result of applying any operator so far defined to unity can be immediately
translated into a contour integral.

Several of our other rules can be derived at once. Thus if $F(z)$ and $G(z)$ are both
expressible in power series in $1/z$, converging when $|z| > R$, they converge absolutely on $C$ and
can be multiplied to give another series in $1/z$, which is uniformly convergent. Hence

$$F(p)\,G(p)\,1 = \frac{1}{2\pi i}\int_C \frac{e^{zt}}{z}F(z)\,G(z)\,dz, \tag{8}$$

since on expansion we see that the two sides are identical.

The partial fraction rule also follows. For if $F(z)/G(z)$ is the ratio of two polynomials,
$G(z)$ being of the same degree as $F(z)$ or higher,

$$\frac{F(z)}{zG(z)} = \sum P(z-\alpha), \tag{9}$$

where $P(z-\alpha)$ is the principal part of $F(z)/zG(z)$ at the pole $z = \alpha$; $\alpha = 0$ being in general
one pole. There is no constant term since both sides tend to zero at $z = \infty$. Then

$$\frac{1}{2\pi i}\int_C e^{zt}\frac{F(z)}{zG(z)}\,dz = \sum\frac{1}{2\pi i}\int_C e^{zt}P(z-\alpha)\,dz. \tag{10}$$

The pole at $z = 0$ gives a finite series of powers of $t$; while the general term in the
contribution from $z = \alpha$ will be

$$\frac{1}{2\pi i}\int_C \frac{A_m e^{zt}}{(z-\alpha)^m}\,dz = \frac{A_m e^{\alpha t}}{2\pi i}\int_{C'}\frac{e^{\zeta t}}{\zeta^m}\,d\zeta, \tag{11}$$

where $\zeta = z - \alpha$, and the transformed contour $C'$ now encloses $\zeta = 0$, since $C$ encloses
$z = \alpha$. But this is $A_m e^{\alpha t}t^{m-1}/(m-1)!$, which is the interpretation given by the partial
fraction rule.


12·081. The contour integral interpretation, however, will not give directly the
result of applying an operator to any function other than a constant. We can verify
consistency if the operand $g(t)$ is of the form $G(p)\,1$, where $G(p)$ is an operator within the
meaning of the definition; for then our rule of 7·054 gives

$$F(p)\,g(t) = F(p)\,G(p)\,1, \tag{1}$$

which can be consistently interpreted by the rule for the composition of operators. But
this is not general. For we have seen that if $G(p)$ fulfils the condition that $G(z)$ is expansible
in a power series in $1/z$ for $|z| > R$, $g(t)$ is an integral function of $t$. We cannot therefore
apply (1) directly if, for instance, the force on a dynamical system is of the form $A(a+t)^{-1}$,
even if $a$ is positive, much less to any case where $g(t)$ is not an analytic function of $t$ for
real positive values of $t$. But we can appeal to the principle of superposition

$$F(p)\,g(t) = g(0)\,f(t) + \int_0^t f(t-\tau)\,dg(\tau), \tag{2}$$

which will provide an interpretation if only $F(p)$ satisfies the fundamental rule. So far,
therefore, the contour integral interpretation is less general than the operational one,
which will always give a solution in terms of a single integral when the problem depends
on a finite number of linear differential equations with constant coefficients.


12·082. Limits of operators. Many physical problems are stated in terms of
partial differential equations, which may be regarded as the formal result of having a
large number $n$ of degrees of freedom and making $n$ tend to infinity. In such cases for any
finite $n$ the operator $F_n(p)$ will satisfy the conditions; but the limit of $F_n(z)$ may not. In
particular it may not be bounded for large $|z|$, and there is then no suitable contour $C$.
A simple case capable of being treated purely operationally is the following. Consider the
equations

$$\frac{dx_r}{dt} + \frac{n}{h}\,x_r = \frac{n}{h}\,x_{r-1},$$

where all $x_r = 0$ ($r \ge 1$) at $t = 0$, and $x_0 = g(t)$. Then for $r \ge 1$

$$x_r = \left(\frac{ph}{n} + 1\right)^{-r} g(t).$$

Take $r = n$ and consider

$$F_n(p)\,g(t) = \left(\frac{ph}{n} + 1\right)^{-n} g(t), \tag{1}$$

where $t$ and $h$ are positive. $F_n(z)$ has a convergent expansion in powers of $1/z$ for all $|z| > n/h$.
We can therefore evaluate this as

$$F_n(p)\,g(t) = \int_0^t g(t-\tau)\left(\frac{n}{h}\right)^n\frac{\tau^{n-1}}{(n-1)!}\,e^{-n\tau/h}\,d\tau \tag{2}$$

$$= \int_0^{nt/h} g\!\left(t - \frac{hv}{n}\right)\frac{v^{n-1}}{(n-1)!}\,e^{-v}\,dv. \tag{3}$$

When $n$ is large the integrand, apart from the $g$ factor, has a sharp maximum near $v = n-1$.
Hence if $t$ is positive and greater than $h$, and $g(t-\tau)$ is continuous, the integral approximates
as $n$ increases to

$$g(t-h)\int_0^{\infty}\frac{1}{(n-1)!}\,v^{n-1}e^{-v}\,dv = g(t-h). \tag{4}$$

Now when $n \to \infty$, $F_n(z) \to \exp(-zh)$. When an operator $F_n(p)$ is such that as $n \to \infty$,
$F_n(z) \to F(z)$, we shall speak of $F(p)$ as the formal limit of $F_n(p)$. Then we have in this sense

$$\exp(-ph)\,g(t) = g(t-h), \tag{5}$$

provided $t$ and $h$ are positive and $t > h$. We therefore obtain an interpretation of
$\exp(-ph)$, although $\exp(-zh)$ is not expansible in powers of $z^{-1}$ and $\exp(-ph)$ is
therefore not defined in our fundamental rules.

If $h > t$, (3) still holds, but the maximum of the factor $v^{n-1}e^{-v}$ is at $v = n-1$, which for
$n > (1 - t/h)^{-1}$ is outside the range of integration. The integral can then be shown easily to
tend to zero.

Hence the result (5) depends essentially on the condition $t > h$; if $t < h$

$$\exp(-ph)\,g(t) = 0. \tag{6}$$

Thus this example of a formal limit of an operator leads to an intelligible result; but even
if $g(t)$ is an analytic function the result of the operation is not an analytic function, since
its values for $t < h$ are not the analytic continuation of $g(t-h)$ for $t > h$.
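The approach of $F_n(p)\,g(t)$ to $g(t-h)$ for $t > h$, and to zero for $t < h$, can be watched numerically from the integral form (3), $\int_0^{nt/h} g(t - hv/n)\,v^{n-1}e^{-v}/(n-1)!\,dv$. This is a sketch under stated assumptions: the operand $g = \cos$, the values of $t$ and $h$, the midpoint rule and the name `delayed` are all illustrative choices.

```python
import math

def delayed(g, t, h, n, steps=20000):
    """Midpoint-rule value of the integral (3) for F_n(p) g(t)."""
    upper = n * t / h
    dv = upper / steps
    total = 0.0
    for k in range(steps):
        v = (k + 0.5) * dv
        # v**(n-1) exp(-v) / (n-1)!, formed through logarithms to avoid overflow
        weight = math.exp((n - 1) * math.log(v) - v - math.lgamma(n))
        total += g(t - h * v / n) * weight * dv
    return total

g = math.cos
h = 1.0
for n in (10, 100, 1000):
    print(n, delayed(g, 2.0, h, n), delayed(g, 0.5, h, n))
print(g(2.0 - h))     # the limit expected for t > h; for t < h the limit is 0
```

For moderate $n$ the kernel is still broad, so the values for $t > h$ differ visibly from $g(t-h)$; the difference falls roughly in proportion to $1/n$, while the values for $t < h$ fall much faster.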

It is a common occurrence in mathematics for the limit of an infinite number of
operations to give something fundamentally different from the separate operations. If an
irrational number is expressed as a decimal, the digits up to any finite stage express a rational fraction, but
the complete decimal is not rational. In the theory of the complex variable, any finite
number of applications of the fundamental rules can give only rational functions, with no
singularities other than poles; but an infinite series and its continuations can define a
function with essential singularities and branch points. Hence there is no occasion for
surprise at the interpretation of $\exp(-ph)$. But we must verify whether it commutes
with $p^{-1}$. We have, writing $\exp(-ph)$ formally as $e^{-ph}$,

$$e^{-ph}p^{-1}g(t) = e^{-ph}\int_0^t g(\tau)\,d\tau = \int_0^{t-h} g(\tau)\,d\tau, \tag{7}$$

$$p^{-1}e^{-ph}g(t) = \int_0^t g(\tau-h)\,d\tau = \int_{-h}^{t-h} g(\tau)\,d\tau, \tag{8}$$

so that the commutative rule holds if and only if $g(t) = 0$ for negative values of $t$; and if
it is, all our operations on it will give 0 for negative $t$.

If we take $\exp(ph)$ with $h > 0$ we are not led to the interpretation

$$\exp(ph)\,g(t) = g(t+h) \tag{9}$$

nor to any interpretation whatever. We have in this case

$$F_n(p)\,g(t) = \left(1 - \frac{ph}{n}\right)^{-n} g(t),$$

which can be shown* to tend to no limit as $n \to \infty$. We shall illustrate this by a special
case later (12·10). Consequently $\exp(ph)$ is not an operator admissible in the system.

* Proc. Camb. Phil. Soc. 36, 1940, 274.

These results are intelligible physically. We ordinarily regard the state of a physical
system at time $t$ as determined by its state at time 0 and the disturbances that have affected
it between times 0 and $t$. For systems with a finite number of degrees of freedom the
operational method makes direct use of this principle. But if $\exp(-ph)\,g(t)$ was equal to $g(t-h)$
for $h > t$ we should be saying that the state at time $t$ is determined by disturbances before
time 0 and not taken into account in the specification of the state at time 0. If $\exp(ph)\,g(t)$
was equal to $g(t+h)$ for $h > 0$ we should be saying that a system can be influenced by
disturbances that have not yet happened. Our results are therefore just what we should
expect physically. The problem is to express them in our mathematical language. It is
necessary to do so because when we treat the vibrations of strings and other problems of
sound by the operational method the operator $\exp(-ph)$ makes an appearance. But it
will not obey the commutation rule unless we restrict the operand to be 0 for $t < 0$, and
though this makes no change in the solution for $t > 0$ it implies that the contour integral
as defined above is not quite what we need. But a modification of the path of integration
gives a result with the required property.

The particular operators investigated above are far from the only ones that have
$\exp(\pm ph)$ as their formal limits. A criterion of consistency is therefore needed if $\exp(-ph)$
is to be satisfactorily defined. The failure to arrive at a satisfactory definition of $\exp(ph)$
can be understood from the form of $F_n(p)$ in this case. It would correspond to a family
of $n$ differential equations of the form

$$\frac{dx_r}{dt} - \frac{n}{h}\,x_r = -\frac{n}{h}\,x_{r-1},$$

with all the variables zero at $t = 0$. But the complementary functions of this set of
equations have the form $t^m\exp(nt/h)$ and become meaningless if $n$ tends to infinity. This will
happen for any mode of approach to a limit such that at least one complementary function
has the form $\exp(\alpha_n t)$, where the real part of $\alpha_n$ tends to $+\infty$ with $n$. We may speak of
such sequences of systems as tending to infinite instability. Hence a necessary condition
for a consistent definition of an operator by a limiting process is that the approach shall
be through systems with no complementary function of this type. It is hard to invent
reasonable sequences of physical systems that tend to infinite instability, but the
principle that it is not approached in the limit turns out to play an important part in
the justification of the application of the operational method to the solution of partial
differential equations.

12·09. Bromwich’s integral. The device introduced by Bromwich* is to replace the
path C of 12·08 by one from $c - i\infty$ to $c + i\infty$, where c is real and positive and the path is so
chosen that all singularities of F(z) are to the left of it. We denote this path by L. It should
be recalled that for the complex variable, as for the real variable, the usual proof of the
existence of an integral fails if the path has an infinite length. The integral is then defined
by first considering termini at a finite distance, for which the existence of the integral
can be proved; if the integral tends to a limit as the termini approach infinity the limit is
taken as the definition of the integral over an infinite range.

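Now let F(z) be a rational function bounded at infinity, and therefore such that its
numerator is of the same or lower degree than the denominator. We shall show that
for t real

$$\frac{1}{2\pi i}\int_L F(z)\,e^{zt}\,\frac{dz}{z} = \frac{1}{2\pi i}\int_C F(z)\,e^{zt}\,\frac{dz}{z} = f(t) \quad (t > 0), \tag{1}$$

$$\frac{1}{2\pi i}\int_L F(z)\,e^{zt}\,\frac{dz}{z} = 0 \quad (t < 0). \tag{2}$$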

12·091. Jordan’s lemma. For this we need a form of the theorem known as Jordan’s
lemma. We first replace F(z)/z by $\phi(z)$ and impose on $\phi(z)$ the weaker condition that for
any $\omega$ we can choose r so that for any $|z| \ge r$ and $\Re(z) < c$ (c > 0), $|\phi(z)| < \omega$. Then for
t > 0 we first take AB a finite stretch of L as shown and complete the contour by a rectangle
ABCD, the corners of which are $c - iX$, $c + iX$, $-X + iX$, $-X - iX$. We can choose
X so that $|\phi(z)| < \omega$ at all points of BC, CD, and DA.

* Proc. Lond. Math. Soc. (2), 15, 1916, 401-448. A similar device was introduced at the same time
by K. W. Wagner, Archiv. f. Electrotechnik, 4, 1916, 159-193.


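Then

$$\left|\int_{BC} e^{tz}\phi(z)\,dz\right| = \left|\int_{-c}^{X} e^{t(-x+iX)}\phi(z)\,dx\right| < \omega\int_{-c}^{X} e^{-tx}\,dx = \frac{\omega}{t}\left(e^{tc} - e^{-tX}\right). \tag{1}$$

Similarly

$$\left|\int_{DA} e^{tz}\phi(z)\,dz\right| < \frac{\omega}{t}\left(e^{tc} - e^{-tX}\right). \tag{2}$$

Also

$$\left|\int_{CD} e^{tz}\phi(z)\,dz\right| = \left|\int_{X}^{-X} e^{t(-X+iy)}\phi(z)\,i\,dy\right| < 2\omega X e^{-tX}. \tag{3}$$

Thus the integrals along BC, CD, DA are together less than

$$2\omega\left(\frac{e^{ct}}{t} + X e^{-tX}\right). \tag{4}$$

[Figure: the rectangle ABCD, with AB a finite stretch of L.]

If then we choose an arbitrary $\epsilon$ we can choose $\omega < \tfrac14 \epsilon t e^{-ct}$, and
then X so that $2\omega X e^{-tX} < \tfrac12\epsilon$; and then the integrals together
are $< \epsilon$. Therefore $\int_L e^{tz}\phi(z)\,dz$ converges and is equal to the
contour integral around all singularities on the negative side
of L.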

If t < 0 and X > 0, (3) may not be arbitrarily small, as the
term $e^{-tX}$ will be large for large X. But we can form a similar
rectangle on the positive side of L, and the appropriate modifications of the argument
show that the integral on L is equal to a contour integral including all singularities to
the positive side of L, provided now that we can choose r so that for any $|z| \ge r$ and
$\Re(z) > c$, $|\phi(z)| < \omega$.

The argument still holds if $\phi(z)$ has an infinite number of singularities on the imaginary
axis. For we are not restricted to vary the path continuously, and if we can find a sequence
of paths tending to infinity such that $|\phi(z)| \to 0$ on them the theorem will still follow, as
in the proof of Mittag-Leffler’s theorem.

12·092. Heaviside’s unit function. F(z)/z of 12·09 satisfies the conditions imposed
on $\phi(z)$ in the proof of this lemma; hence (1) and (2) follow, and the use of the path L
instead of the closed contour gives the same interpretations for positive t but zero for
negative t, since there are no singularities to the right of L. In particular if we take
F(z) = 1 the integral is 1 for positive t and 0 for negative t. We define as in 7·09

$$H(t) = 1 \quad (t > 0), \qquad H(t) = 0 \quad (t < 0), \tag{1}$$

and call H(t) the Heaviside unit function, since it occurs, written as 1, in many places in
Heaviside’s writings. He also wrote p1 for a function whose integral is H(t) and called this
the impulse function. It is identical with the Dirac $\delta$-function. We shall not have occasion
to use this function, which requires a special type of integration to give a meaning to its
applications.

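As a numerical illustration of (1) and (2) of 12·09, the following sketch approximates the integral along L by the trapezoidal rule on a finite stretch of the line $\Re(z) = c$. The test function $F(z) = z/(z+1)^2$, for which the residue at z = -1 gives $f(t) = t e^{-t}$, the value c = 1, and the truncation height Y are illustrative assumptions and are not taken from the text.

```python
import cmath
import math

def bromwich(F, t, c=1.0, Y=400.0, steps=80000):
    """Trapezoidal estimate of (1/(2*pi*i)) * integral of F(z) e^{zt} dz/z
    along the finite stretch c - iY .. c + iY of the line Re(z) = c."""
    h = 2.0 * Y / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        z = complex(c, -Y + k * h)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * F(z) * cmath.exp(z * t) / z
    # On the line dz = i dy, so (1/(2*pi*i)) * i = 1/(2*pi).
    return (total * h / (2.0 * math.pi)).real

def F(z):
    # Illustrative rational function, bounded at infinity, whose only
    # singularity (a double pole at z = -1) lies to the left of the path.
    return z / (z + 1.0) ** 2

if __name__ == "__main__":
    for t in (1.0, 2.0, -1.0):
        exact = t * math.exp(-t) if t > 0 else 0.0
        print(t, bromwich(F, t), exact)
```

For positive t the estimate should lie close to $t e^{-t}$, and for negative t close to zero, up to the error caused by cutting the path off at a finite height.
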
The unit function is discontinuous at t = 0. If we take the principal value of the integral
we get $\tfrac12$, but there appear to be no occasions when this is used.

If F(z) is analytic and bounded to the right of L, and the integral exists, we define

$$F(p)H(t) = \frac{1}{2\pi i}\int_L F(z)\,e^{zt}\,\frac{dz}{z} = f(t)H(t). \tag{2}$$

12·093. Integration and differentiation of Bromwich’s integral. In

$$F(p)H(t) = f(t)H(t) = \frac{1}{2\pi i}\int_L F(z)\,e^{zt}\,\frac{dz}{z} \tag{1}$$

let us integrate under the integral sign from 0 to t; we have

$$\int_0^t f(t)H(t)\,dt = \frac{1}{2\pi i}\int_L F(z)\left(e^{zt} - 1\right)\frac{dz}{z^2}. \tag{2}$$

But if F(z)/z is analytic and bounded to the right of L we can write

$$\int_L F(z)\,\frac{dz}{z^2} = \int_{L'} F(z)\,\frac{dz}{z^2}, \tag{3}$$

where L' lies entirely within the region where $|F(z)/z| < \omega$, and $\omega$ is arbitrarily small.
Hence this integral is 0 and

$$p^{-1} f(t)H(t) = \frac{1}{2\pi i}\int_L F(z)\,e^{zt}\,\frac{dz}{z^2}, \tag{4}$$

which is our interpretation of $p^{-1}F(p)H(t)$. Hence the integral is consistent with the
interpretation of $p^{-1}$ as meaning definite integration from 0 to t.

If we differentiate (1) under the integral sign we get

$$\frac{d}{dt}\,f(t)H(t) = \frac{1}{2\pi i}\int_L F(z)\,e^{zt}\,dz = pF(p)H(t), \tag{5}$$

provided now that F(z), and not merely F(z)/z, satisfies the conditions of Jordan’s lemma:
if it does not the integral is meaningless. But if it does, take t = 0 and use the path L'; then

$$\lim_{t\to 0} f(t)H(t) = \frac{1}{2\pi i}\int_{L'} F(z)\,\frac{dz}{z} = 0. \tag{6}$$

Hence the interpretation

$$\frac{d}{dt}\,f(t)H(t) = pF(p)H(t) \tag{7}$$

is valid provided that f(0) = 0, but not otherwise. Our restriction on the identification
of p with differentiation therefore reappears in relation to Bromwich’s integral.


12·10. If we interpret exp(-ph) F(p)H(t) or F(p) exp(-ph)H(t) by Bromwich’s rule
we get

$$\frac{1}{2\pi i}\int_L F(z)\,e^{z(t-h)}\,\frac{dz}{z} = f(t-h)\,H(t-h), \tag{1}$$

which agrees with the interpretation obtained by treating exp(-ph) as the limit of an
operator. If we apply the rule to exp(ph) with h positive we get f(t + h) H(t + h),
whereas our limiting process showed that this operator is not uniquely definable. In one
sense this is trivial because this operator never occurs in practice, but it suggests that
further precautions are needed. Our limiting process gives an explanation. We tried
to define exp(ph) as the formal limit of $(1 - ph/n)^{-n}$; but then the F(z) of the Bromwich
integral is $(1 - zh/n)^{-n}$ and has a pole at z = n/h. Hence, however we draw the path L,
there will be values of n such that the integrand has a pole to the right of it; and the limit
of the operation on f(t) is not to be identified with the Bromwich integral along any path,
since it is essential to the definition that all singularities shall be on the negative side.
These conditions will clearly arise in any case where there is a tendency to infinite instability.

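The remark that exp(-ph) can be reached as the limit of an operator can be illustrated numerically. The sketch below assumes the approximating operator $(1 + ph/n)^{-n}$, the stable counterpart of the $(1 - ph/n)^{-n}$ discussed above; this choice and the closed form used are assumptions made for illustration. Its Bromwich interpretation acting on H(t) has poles only at z = 0 and z = -n/h, and evaluating the residues gives $1 - e^{-nt/h}\sum_{k=0}^{n-1}(nt/h)^k/k!$, which should approach H(t - h) as n grows.

```python
import math

def delayed_step_approx(n, t, h=1.0):
    """Value of (1 + p*h/n)**(-n) H(t), i.e.
    1 - exp(-x) * sum_{k<n} x**k / k!  with x = n*t/h (for t > 0).
    Terms are formed through logarithms to avoid overflow for large n."""
    if t <= 0.0:
        return 0.0
    x = n * t / h
    partial = 0.0
    for k in range(n):
        partial += math.exp(-x + k * math.log(x) - math.lgamma(k + 1))
    return 1.0 - partial

if __name__ == "__main__":
    # Before the delay (t = 0.5 h) the values fall towards 0,
    # after it (t = 1.5 h) they rise towards 1.
    for n in (1, 10, 100, 1000):
        print(n, delayed_step_approx(n, 0.5), delayed_step_approx(n, 1.5))
```

All poles of this family stay in the left half plane, so a single path L serves for every n, in contrast with the unstable family whose pole z = n/h moves off to the right.
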



To see what happens in this case, we have from 12·082 (10)

$$f_n(t) = \frac{1}{2\pi i}\int_C e^{zt}\left(1 - \frac{zh}{n}\right)^{-n}\frac{dz}{z}. \tag{2}$$


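For t > 0 the pole at z = 0 contributes 1; a contour C about z = n/h contributes

$$\frac{1}{2\pi i}\int_C \left(-\frac{n}{h}\right)^{n}\frac{e^{nt/h}\,e^{z't}}{z'^{\,n}\,(n/h + z')}\,dz',$$

where z = n/h + z'. The residue of $\dfrac{e^{z't}}{z'^{\,n}(n/h + z')}$ at z' = 0 is

$$\sum_{k=0}^{n-1}\frac{t^k}{k!}\,\frac{h}{n}\left(-\frac{h}{n}\right)^{n-1-k}. \tag{3}$$

Hence for large t, $f_n(t)$ behaves like

$$1 - \left(-\frac{n}{h}\right)^{n-1}\frac{t^{n-1}}{(n-1)!}\,e^{nt/h} = 1 + u_n(t). \tag{4}$$

Then

$$\frac{u_{n+1}(t)}{u_n(t)} = -\frac{t}{h}\left(1 + \frac{1}{n}\right)^{n} e^{t/h}, \tag{5}$$

which is < -1 for sufficiently large t. Hence $u_n(t)$, and therefore $f_n(t)$, oscillate infinitely for
sufficiently large t as n increases.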

It will be true that

$$F(p, n)H(t) = \frac{1}{2\pi i}\int_L F(z, n)\,e^{zt}\,\frac{dz}{z} \tag{6}$$

for L drawn so that all singularities of F(z, n) are on the negative side; it will also be true
that

$$\lim_{n\to\infty} F(p, n)H(t) = \lim_{n\to\infty}\frac{1}{2\pi i}\int_{L_n} F(z, n)\,e^{zt}\,\frac{dz}{z}, \tag{7}$$

provided that these limits exist, $L_n$ being chosen for each n in accordance with the above
rule; but if $\alpha_n$ is the largest real part of z at any pole of F(z, n) and $\alpha_n \to \infty$ we cannot
necessarily invert the order of integration and make $n \to \infty$. The result would be

$$\frac{1}{2\pi i}\int_L \left[\lim_{n\to\infty} F(z, n)\right] e^{zt}\,\frac{dz}{z}, \tag{8}$$

but this assumes a fixed path L, and we cannot draw any fixed L so that all poles of F(z, n)
are on its negative side for all n. Further, the limit (8) may exist, but (7) cannot for all
initial conditions, since the function will in general ultimately increase like $e^{\alpha_n t}$, which
cannot tend to any limit as $\alpha_n \to \infty$.

On the other hand, if the systems do not tend to infinite instability, inversion
is possible in fairly wide conditions. For then we can choose k greater than the real
part of any pole of any $F_n(z)$; and then we can take the c of the Bromwich integral
greater than k, and use the same path L for all n. A proof has been given by D. P. Dalzell*
(cf. 12·101). Consequently the operational method for continuous systems is now justified.
We regard the differential equation as a formal limit of a set of ordinary differential
equations in a finite number of variables, which can be solved by the operational method;
and the formal limit of the operational solution is the solution for the continuous system
when the number of variables is made infinite, and can be evaluated by means of Bromwich’s
integral. The formal limit of the operational solution can be found directly by
forming the subsidiary equation from the partial differential equation and solving with
the given boundary conditions.

* Proc. Camb. Phil. Soc. 36, 1940, 276-9.

This approach seems more satisfactory than the usual one, which takes as a fundamental
requirement that the partial differential equation shall be satisfied. In, for instance,
the motion of a stretched string, the equations of motion are rightly applied to a finite 
length of the string since this has a finite mass. But if we proceed to the limit and speak 
of the acceleration at a given point of the string we are differentiating a function that may 
have no derivative. If, for instance, the initial conditions are that the string is drawn aside
a distance $\eta$ at $x = \xi$, the string being straight at intervening points, the partial differential
equation is meaningless at $x = \xi$, which is the only place where the initial acceleration is
not zero. But if we replace the string by a set of particles uniformly spaced and with total 
mass equal to that of the string, we get an unambiguous solution, and the limit of this 
when the number of particles is made indefinitely large satisfies the equation for any 
finite length of the string, which is as much as we have any right to expect. A further 
consideration is that the actual string has an atomic structure and is really composed of 
a finite number of particles. The physical problem, therefore, is not the motion of a 
continuous string. The interest of the latter is simply that of an approximation, which 
for many purposes is good enough; but any question about the validity of a solution should 
be directed towards its accuracy for the discrete system and not towards the validity of 
the process used for solving the partial differential equation, since this equation is itself 
under suspicion as an expression of the physical facts. The process of making $\delta x$ tend to
zero continuously is intrinsically meaningless for an actual string, because there are no 
particles to give the displacement a meaning within a certain spacing. Similarly in 
thermal problems it is meaningless to specify an absolute temperature within, say, 
1 part in 1000, unless something of the order of a million molecules are considered, 
and the differential equation of heat flow must also be regarded as a somewhat faulty 
idealization. 


12·101. Dalzell’s theorem. Let F(z, n), an analytic function of z, satisfy the
following conditions:

(i) For each n a finite $R_n$ exists such that for $|z| > R_n$

$$F(z, n) = a_{0,n} + \sum_{r=1}^{\infty}\frac{a_{r,n}}{z^r}.$$

(ii) For all $\Re(z) > k$, F(z, n) has a limit F(z) as $n \to \infty$.

(iii) Positive values of M, k exist such that for all n, and for all $\Re(z) > k$, $|F(z, n)| < M$.

(iv) In any finite interval of t, if c > k, possibly excluding a finite set of fixed intervals
of arbitrarily small total length $\delta$,

$$\left|\int_c^{c+iY} F(z, n)\,e^{zt}\,dz\right| < N$$

for any real Y, where N may depend on $\delta$ but not on n or t. We show that (v)

$$F(p, n)H(t) \to F(p)H(t)$$

almost everywhere; and (vi)

$$\frac{1}{p}F(p, n)H(t) \to \frac{1}{p}F(p)H(t)$$

for all t.

By the Osgood-Vitali theorem (11·21) it follows from conditions (ii) and (iii) that in
any bounded region of the half plane $\Re(z) > k$, $F(z, n) \to F(z)$ uniformly, and F(z) is an
analytic function of z in the half plane, possibly with a singularity at infinity. Clearly
also $|F(z)| < M$ everywhere in the half plane.

The function

$$f_n(t) = F(p, n)H(t) = \frac{1}{2\pi i}\int_L F(z, n)\,e^{zt}\,\frac{dz}{z}, \tag{1}$$

where $\Re(z) = c > k$ on L, exists for all $t \ne 0$ by (i), is 0 for t < 0 by (i) or by (iii) and
Jordan’s lemma, and is a continuous function of t for t > 0, by (i). Also if $z = c + i\eta$,

$$\frac{1}{z} = \frac{c - i\eta}{c^2 + \eta^2}, \tag{2}$$

and $\Re(1/z)$ and $\Im(1/z)$ for $|\eta| > c$ are monotonic functions tending to zero as $|\eta| \to \infty$.
Hence by Dirichlet’s test, in any interval where (iv) is satisfied, for any $\omega$ we can
choose Y so that

$$\left|\frac{1}{2\pi i}\left(\int_{c-i\infty}^{c-iY} + \int_{c+iY}^{c+i\infty}\right) F(z, n)\,e^{zt}\,\frac{dz}{z}\right| < \omega \tag{3}$$

for all n; further $F(z, n)\,e^{zt}/z \to F(z)\,e^{zt}/z$ uniformly in $-Y < \Im(z) < Y$. Therefore

$$F(p, n)H(t) \to \frac{1}{2\pi i}\int_L F(z)\,e^{zt}\,\frac{dz}{z} = F(p)H(t) = f(t), \tag{4}$$

say; and f(t), being the limit of a uniformly convergent sequence of continuous functions,
is continuous in any such interval.

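Next,

$$p^{-1} f_n(t) = \int_0^t f_n(\tau)\,d\tau = \frac{1}{2\pi i}\int_L F(z, n)\,e^{zt}\,\frac{dz}{z^2} \tag{5}$$

by 12·093, since F(z, n) has no singularities for $\Re(z) > k$; and

$$\left|\frac{1}{2\pi i}\left(\int_{c-i\infty}^{c-iY} + \int_{c+iY}^{c+i\infty}\right) F(z, n)\,e^{zt}\,\frac{dz}{z^2}\right| < \frac{M e^{ct}}{\pi\sqrt{c^2 + Y^2}} < \omega \tag{6}$$

if $Y > M e^{ct}/\pi\omega$. Also $F(z, n)\,e^{zt}/z^2 \to F(z)\,e^{zt}/z^2$ uniformly for $-Y < \Im(z) < Y$; and therefore
in any interval of t

$$p^{-1} f_n(t) \to \frac{1}{2\pi i}\int_L F(z)\,e^{zt}\,\frac{dz}{z^2}, \tag{7}$$

and the right side is a continuous function of t.

But in any interval where the sequence $\{f_n(t)\}$ is uniformly convergent

$$\int_{t_0}^{t} f(t)\,dt = \frac{1}{2\pi i}\int_L F(z)\left(e^{zt} - e^{zt_0}\right)\frac{dz}{z^2} = \lim_{n\to\infty}\int_{t_0}^{t} f_n(t)\,dt, \tag{8}$$

and therefore the right side of (7) has derivative f(t) except possibly at a finite set of
values of t in any interval of t, and it is continuous at these values. It can therefore
be denoted by $p^{-1} f(t)$.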

Of the conditions stated, (i) is implied by the general principle that we proceed
through a sequence of discrete systems, and (ii) is obviously necessary; (iii) expresses
the necessary condition that the sequence of systems does not tend to infinite instability.
A less severe condition could be sufficient, but not much less. None of the functions

$$\frac{z^2}{n^{1/2}(z^2 + n^2)}, \qquad \frac{z^2}{z^2 + n^2}, \qquad \frac{n z^2}{z^2 + n^2}$$

is uniformly bounded in the half plane, as we see by taking z = in + c. For the first,
both (v) and (vi) are true; for the second, (v) is false and (vi) true; for the third, both
are false. The exclusion of a set of measure $\delta$ from (iv) allows us to deal with a sequence
that would tend, say, to sech ph. Such sequences arise in wave problems, and the discontinuities
are an essential feature of the solutions.

Dalzell’s argument is different but rests on substantially the same physical principles. 
It would be tempting to argue that as $p^{-1}F(p)H(t)$ is the limit of a uniformly convergent
series of analytic functions of t, expressible by power series, $p^{-1}F(p)H(t)$ is an
analytic function of t and therefore has a derivative everywhere. This is not so; for in
general the integral (1) does not exist for complex t, and the limiting process does not
lead to a definition of $p^{-1} f(t)$ off the real axis.

12*11. Text-books on differential equations describe a method of finding particular 
integrals by expansion in powers of D = d/dt. This is valid if the function operated on is 
a polynomial, when the series terminates. Its relation to the expansion in powers of $p^{-1}$
is as follows. We have ? 

where we take the principal parts of $F(z)/z^{m+1}$ at all poles and then replace z by p. One
such pole in general is z = 0. If then near z = 0

$$F(z) = \alpha_0 + \alpha_1 z + \dots + \alpha_m z^m + O(z^{m+1}), \tag{2}$$

the corresponding terms in $F(p)\,t^m$ are

$$m!\left(\frac{\alpha_0}{p^m} + \frac{\alpha_1}{p^{m-1}} + \dots + \alpha_m\right) 1 = \alpha_0 t^m + m\alpha_1 t^{m-1} + \dots + m!\,\alpha_m = F(D)\,t^m, \tag{3}$$

which is the particular integral found by the usual method. The principal parts at the
other poles $\alpha$ give terms with factors $\exp(\alpha t)$, which in the usual method are part of the
complementary function. In the Heaviside method they have such coefficients that the
function and its derivatives up to the (n - 1)th vanish at t = 0 when the expansion of
F(p) in negative powers of p begins with a term in $p^{-n}$.

It follows that

$$F(D)\,t^m = \frac{m!}{2\pi i}\int_C F(z)\,e^{zt}\,\frac{dz}{z^{m+1}}, \tag{4}$$

where C is a contour surrounding the origin but not enclosing any other pole of F(z);
while

$$F(p)\,t^m = \frac{m!}{2\pi i}\int_L F(z)\,e^{zt}\,\frac{dz}{z^{m+1}}. \tag{5}$$

The difference between the two methods can therefore be stated as a difference in the
path of integration for the same function when the operand is a polynomial.

If, however, the operand $g(t) = t^m$ is a positive fractional power of t, the interpretation
of F(p) g(t) remains significant for all positive t. That of F(D) g(t) does not; for

$$(\alpha_0 + \alpha_1 D + \dots)\,t^m = \alpha_0 t^m + m\alpha_1 t^{m-1} + \dots + \alpha_r\,m(m-1)\dots(m-r+1)\,t^{m-r} + \dots,$$

and if $\sum \alpha_r z^r$ converges like a geometric progression this series never converges.

Most of the non-convergent series obtained by Heaviside were due to the fact that 
he never clearly distinguished p from D. 

12*12. Special contour integrals. The unit function can be transformed to give 
an integral that is fundamental in the theory of Fourier series. 

We can replace L by a path from $-i\infty$ to $-i\delta$, and $i\delta$ to $i\infty$, connecting them by a small
semicircle of radius $\delta$ about 0, on the positive side. The first two paths give the principal
value of a complex integral;

$$\frac{1}{2\pi i}\left(\int_{-i\infty}^{-i\delta} + \int_{i\delta}^{i\infty}\right) e^{zt}\,\frac{dz}{z} \to \frac{1}{2\pi i}\,P\!\int_{-\infty}^{\infty} e^{iyt}\,\frac{dy}{y} \tag{1}$$

$$= \frac{1}{\pi}\int_0^{\infty} \sin yt\,\frac{dy}{y}. \tag{2}$$

The small semicircle, in the limit, gives $\tfrac12$. Hence

$$\tfrac12 + \frac{1}{\pi}\int_0^{\infty} \sin yt\,\frac{dy}{y} = H(t); \tag{3}$$

$$\int_0^{\infty} \sin yt\,\frac{dy}{y} = \tfrac12\pi \quad (t > 0), \tag{4}$$

$$= -\tfrac12\pi \quad (t < 0). \tag{5}$$

If t = 0 the integral is zero, so that we can write it in general as $\tfrac12\pi\,\operatorname{sgn} t$.


12·121. Another similar integral is, for real $\lambda$,

$$I = \int_0^{\infty}\frac{\cos\lambda x}{1 + x^2}\,dx = \tfrac12\int_{-\infty}^{\infty}\frac{e^{i\lambda z}}{1 + z^2}\,dz. \tag{6}$$

We complete the contour for $\lambda > 0$ by a large rectangle on the upper side of the real
axis. The contour contains a pole at z = i; and the residue is $e^{-\lambda}/2i$. Thus

$$I = \tfrac12\cdot 2\pi i\,\frac{e^{-\lambda}}{2i} = \tfrac12\pi e^{-\lambda}. \tag{7}$$

For the integral as stated this is the value if $\lambda$ is taken positive. But (6) would be
equally correct if we took the negative value of $\lambda$. In that case however we should
have to complete the contour by a large rectangle below the real axis, and it would
now contain the pole at z = -i, described in the negative direction. The residue at this
pole is $-e^{\lambda}/2i$; hence we get

$$I = \tfrac12\pi e^{\lambda}, \tag{8}$$

which is the same as (7) since we are now using the negative value of $\lambda$.

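The value obtained by residues can be checked by direct quadrature. The sketch below is only an illustration: the composite Simpson rule, the cut-off X and the number of subintervals are arbitrary choices, and the truncated tail limits the agreement to a few decimal places.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return s * h / 3.0

def cosine_integral(lam, X=200.0, n=40000):
    """Approximate I = integral_0^inf cos(lam*x)/(1+x^2) dx, truncated at X."""
    return simpson(lambda x: math.cos(lam * x) / (1.0 + x * x), 0.0, X, n)

if __name__ == "__main__":
    for lam in (1.0, -1.0, 2.5):
        # The residue calculation gives (pi/2) * exp(-|lam|) for either sign.
        print(lam, cosine_integral(lam), 0.5 * math.pi * math.exp(-abs(lam)))
```

Both signs of the parameter give the same number, in agreement with the remark that the two ways of closing the contour lead to the same value.
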
12·122. This integral can also be written

$$\int_{-\infty}^{\infty}\frac{e^{\kappa x}\,dx}{\pi(1 + x^2)} = e^{i\kappa} \quad (\Im(\kappa) > 0), \qquad = e^{-i\kappa} \quad (\Im(\kappa) < 0), \tag{9}$$

where $\kappa$ is purely imaginary. In this form it appears in probability theory as the
characteristic function of a certain probability law.*


* Cf. Jeffreys, Theory of Probability, p. 76.

12·123. Take again

$$\int_0^{\infty}\frac{\cos ax - \cos bx}{x^2}\,dx = \tfrac12\,\Re\,P\!\int_{-\infty}^{\infty}\frac{e^{iaz} - e^{ibz}}{z^2}\,dz. \tag{10}$$

For b > a > 0 we take a large rectangle as in 12·121, indented by a small semicircle about 0;
there is a simple pole of residue i(a - b) at the origin, and the integral is

$$\tfrac12\,\Re\{(\pi i)\,i(a - b)\} = \tfrac12\pi(b - a). \tag{11}$$

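12·124. For real a we know that

$$\int_0^{\infty} e^{-\frac12 a x^2}\,dx = \left(\frac{\pi}{2a}\right)^{1/2}. \tag{1}$$

But if we take a path going to infinity in any direction such that $-\tfrac14\pi < \arg z < \tfrac14\pi$ the
integral will be the same; for we can complete the contour by an arc at a large
distance, on which $|z \exp(-\tfrac12 a z^2)|$ can be made arbitrarily small. Putting

$$z = r \exp i\alpha,$$

we have

$$\int_0^{\infty} e^{-\frac12 a r^2 \exp 2i\alpha}\,dr = \left(\frac{\pi}{2a}\right)^{1/2} e^{-i\alpha}, \tag{2}$$

whence, separating real and imaginary parts,

$$\int_0^{\infty} e^{-\frac12 a r^2\cos 2\alpha}\cos\left(\tfrac12 a r^2\sin 2\alpha\right)dr = \left(\frac{\pi}{2a}\right)^{1/2}\cos\alpha,$$

$$\int_0^{\infty} e^{-\frac12 a r^2\cos 2\alpha}\sin\left(\tfrac12 a r^2\sin 2\alpha\right)dr = \left(\frac{\pi}{2a}\right)^{1/2}\sin\alpha, \qquad -\tfrac14\pi < \alpha < \tfrac14\pi. \tag{3}$$

It can be shown that at the limit $\alpha = \tfrac14\pi$ the integral still converges and is continuous
for $|\alpha| \le \tfrac14\pi$. (Cf. M24.) Hence

$$\int_0^{\infty}\cos\left(\tfrac12 a r^2\right)dr = \int_0^{\infty}\sin\left(\tfrac12 a r^2\right)dr = \tfrac12\left(\frac{\pi}{a}\right)^{1/2}. \tag{4}$$

These integrals are the basis of the methods of steepest descents and stationary phase
for the approximate evaluation of complex integrals, used especially in the theory of
dispersion of waves and the theory of probability, including statistical mechanics.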

12·125. The theory of the factorial function makes use of the integral

$$\int_0^{\infty}\frac{z^a\,dz}{(1 + z)^2}, \tag{1}$$

where a may be fractional or complex but
$1 > \Re(a) > -1$. Then there is a branch point at
z = 0. Take

$$\int\frac{(z e^{i\pi})^a\,dz}{(1 + z)^2} = 0 \tag{2}$$

around the path shown, where $\arg(z e^{i\pi})$ is taken
as zero for z between -1 and 0. The large arcs contribute 0 in the limit since $\Re(a) < 1$.
The pole at z = -1 contributes $(-2\pi i)(-a)$ in the limit, since the residue is -a, and
the paths from and to $-\infty$ give contributions cancelling in the limit, since -1 is not
a branch point. The small circle about 0 gives 0 in the limit since $\Re(a) > -1$. Thus we
are concerned only with the integrals between 0 and $+\infty$. But these are

$$\left(e^{-i\pi a} - e^{i\pi a}\right)\int_0^{\infty}\frac{z^a\,dz}{(1 + z)^2} = -2i\sin\pi a\int_0^{\infty}\frac{z^a\,dz}{(1 + z)^2},$$

since $\arg(z e^{i\pi})$ is $-\pi$ on the upper line and $+\pi$ on the lower. Therefore

$$\int_0^{\infty}\frac{z^a\,dz}{(1 + z)^2} = \frac{\pi a}{\sin\pi a}. \tag{3}$$


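For real a between -1 and 1 the result $\pi a/\sin\pi a$ can be compared with a direct numerical estimate. The sketch below is an illustration only; the substitution $z = e^u$, the range of u and the step are assumptions chosen so that the integrand decays exponentially at both ends.

```python
import math

def factorial_integral(a, U=60.0, steps=12000):
    """Trapezoidal estimate of integral_0^inf z**a / (1+z)**2 dz for real
    -1 < a < 1, after the substitution z = exp(u), dz = exp(u) du."""
    h = 2.0 * U / steps
    total = 0.0
    for k in range(steps + 1):
        u = -U + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp((a + 1.0) * u) / (1.0 + math.exp(u)) ** 2
    return total * h

if __name__ == "__main__":
    for a in (0.5, -0.5, 0.25):
        print(a, factorial_integral(a), math.pi * a / math.sin(math.pi * a))
```

Values of a close to 1 or -1 would need a wider range of u, since the integrand then decays slowly at one end.
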
12·126. Next consider the integral,* possibly for fractional m,

$$\frac{1}{2\pi i}\int_L z^{m-1} e^{zt}\,dz, \tag{1}$$

which converges if m < 1. If m > 1 the integral along L does not exist. But if we modify
the path to L' so that L' has asymptotes in the third and second quadrants, not parallel to
the imaginary axis, the integral will converge for t real
and positive without restriction on m. We therefore consider
the integral along L'. Now this is equivalent to an
integral along M, which consists of a small circle about
the origin and two lines from and to $-\infty$. In the first
place we take m > 0; then the integral about the small
circle tends to 0 when the circle is made arbitrarily small.
We take z to be $\mu e^{i\pi}$ and $\mu e^{-i\pi}$ on CD and AB respectively.

Then on CD

$$z^{m-1} e^{zt}\,dz = e^{m\pi i}\mu^{m-1} e^{-\mu t}\,d\mu \tag{2}$$

and on AB similarly it is $e^{-m\pi i}\mu^{m-1} e^{-\mu t}\,d\mu$. Hence

$$\frac{1}{2\pi i}\int_M z^{m-1} e^{zt}\,dz = \frac{1}{2\pi i}\int_{\infty}^{0} e^{-m\pi i}\mu^{m-1} e^{-\mu t}\,d\mu + \frac{1}{2\pi i}\int_0^{\infty} e^{m\pi i}\mu^{m-1} e^{-\mu t}\,d\mu$$

$$= \frac{1}{\pi}\sin m\pi\int_0^{\infty}\mu^{m-1} e^{-\mu t}\,d\mu = \frac{\sin m\pi}{\pi}\,\frac{(m-1)!}{t^m}. \tag{3}$$

But by a known identity (15·02)

$$\frac{\sin m\pi}{m\pi} = \frac{1}{m!\,(-m)!}, \tag{4}$$

and hence

$$\frac{1}{2\pi i}\int_M z^{m-1} e^{zt}\,dz = \frac{1}{t^m\,(-m)!} \tag{5}$$

for 0 < m. But both sides of this equation are analytic functions of m; hence it is true
for all m. (The singularity at z = 0 does not affect this statement. It is proved for
0 < m when the circle is arbitrarily small; it is therefore true for 0 < m for any size
of the circle; and with a non-zero radius of the circle the question of divergence at 0 does
not arise.) Changing m to -n we have therefore for all n, if t > 0,

$$p^{-n} H(t) = \frac{t^n}{n!}, \tag{6}$$

if the left side is determined by an integral on L' or M; and this is also true for the path
L if n > -1. In particular, since

$$(-\tfrac12)! = \sqrt{\pi}, \qquad (\tfrac12)! = \tfrac12\sqrt{\pi}, \qquad (-\tfrac32)! = -2\sqrt{\pi},$$

$$p^{1/2} H(t) = \frac{1}{\sqrt{\pi t}}, \qquad p^{-1/2} H(t) = 2\sqrt{\frac{t}{\pi}},$$

and in a special sense

$$p^{3/2} H(t) = -\frac{1}{2\sqrt{\pi}\,t^{3/2}}.$$

If n is a negative integer n! is infinite and $p^{-n}H(t) = 0$.

* Chapter 15 should be read before this section. 12·13 and later sections are independent of
Chapter 15.

Operators with $n \le -1$ are not valid in the strict sense that they can be interpreted by
integrals on the Bromwich path. They require modification of the path to L' or M to
give them a meaning. They do not arise directly in the solution of physical problems. It
often happens, however, that operational solution leads to an integral valid on the
Bromwich path, but the easiest way of evaluating it is first to modify the path to L' or M
(subject to there being no singularities between the paths) and then to expand in ascending
powers of z on L' or M. The integrals on L corresponding to n > -1 will then be as
stated in (6); but if $n \le -1$ this formula should be regarded simply as an aid to memory,
as a shorthand for

$$\frac{1}{2\pi i}\int_M e^{zt} z^{-n-1}\,dz = \frac{t^n}{n!} \quad (t > 0). \tag{7}$$


If we differentiate this under the integral sign with regard to n we get another uniformly
convergent integral. If n > -1 it converges on L. Then we have

$$\frac{1}{2\pi i}\int_L e^{zt} z^{-n-1}\log z\,dz = -\frac{t^n\log t}{n!} + \frac{t^n}{n!}\frac{d}{dn}\log n!, \tag{8}$$

or, in operational form,

$$p^{-n}\log p\,H(t) = \left(\frac{d}{dn}\log n! - \log t\right)\frac{t^n}{n!}\,H(t). \tag{9}$$

In particular, if n = 0,

$$\log p\,H(t) = (-\gamma - \log t)\,H(t), \tag{10}$$

where $\gamma$ is Euler’s constant (cf. 15·04).

Specially important operators are $p^{1/2}\exp(-ap^{1/2})$ and $\exp(-ap^{1/2})$, where a is a positive
constant. Interpreted by integrals on L they are significant, since on L the argument of $z^{1/2}$
is between $\pm\tfrac14\pi$ and the integrand decreases exponentially at the limits. Then

$$p^{1/2}\exp(-ap^{1/2})\,H(t) = \frac{1}{2\pi i}\int_L \exp(zt - az^{1/2})\,\frac{dz}{z^{1/2}}, \tag{11}$$

$$\exp(-ap^{1/2})\,H(t) = \frac{1}{2\pi i}\int_L \exp(zt - az^{1/2})\,\frac{dz}{z}. \tag{12}$$

These can be evaluated in two ways, both of which have many other applications.
First, we notice that the integrals are unaffected for t > 0 if we use the paths L' or M.
Also $\exp(-az^{1/2})$ can be expanded in an absolutely convergent series. Hence we can
reverse the order of integration and summation (1·111) and write

$$p^{1/2}\exp(-ap^{1/2})\,H(t) = \frac{1}{2\pi i}\int_M e^{zt} z^{-1/2}\left(\sum_{n=0}^{\infty}(-1)^n\frac{a^n z^{n/2}}{n!}\right)dz$$

$$= \frac{1}{2\pi i}\sum_{n=0}^{\infty}(-1)^n\frac{a^n}{n!}\int_M e^{zt} z^{(n-1)/2}\,dz$$

$$= \sum(-1)^n\frac{a^n}{n!}\,p^{\frac12(n+1)} H(t) = \sum(-1)^n a^n\,\frac{t^{-\frac12(n+1)}}{n!\,\{-\tfrac12(n+1)\}!}\,H(t) \tag{13}$$

in the sense of (7); then all terms with odd n vanish, and if n = 2m

$$n!\,\{-\tfrac12(n+1)\}! = (2m)!\,\{-\tfrac12(2m+1)\}! = \frac{1\cdot 2\cdots 2m\,\sqrt{\pi}}{(-\tfrac12)(-\tfrac32)\cdots(-\tfrac12(2m-1))}$$

$$= (-2)^m\,2\cdot 4\cdots 2m\,\sqrt{\pi} = (-4)^m\,m!\,\sqrt{\pi}. \tag{14}$$

Then for t > 0

$$p^{1/2}\exp(-ap^{1/2})\,H(t) = \frac{1}{\sqrt{\pi}}\sum_{m=0}^{\infty}\frac{a^{2m}\,t^{-m-1/2}}{(-4)^m\,m!} = \frac{1}{\sqrt{\pi t}}\exp\left(-\frac{a^2}{4t}\right). \tag{15}$$

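The effect of removing the factor $p^{1/2}$ is that it is now the terms with even n that
vanish, with the exception of the first; we have now

$$\exp(-ap^{1/2})\,H(t) = \sum_{n=0}^{\infty}(-1)^n a^n\,\frac{t^{-n/2}}{n!\,(-\tfrac12 n)!} = 1 - \sum_{m=0}^{\infty}\frac{a^{2m+1}\,t^{-m-1/2}}{(2m+1)!\,(-m-\tfrac12)!} \tag{16}$$

$$= 1 - \frac{2}{\sqrt{\pi}}\left\{\frac{a}{2\sqrt{t}} - \frac{1}{1!\,3}\left(\frac{a}{2\sqrt{t}}\right)^3 + \frac{1}{2!\,5}\left(\frac{a}{2\sqrt{t}}\right)^5 - \dots\right\}. \tag{17}$$

If we define

$$\operatorname{erf} x = \frac{2}{\sqrt{\pi}}\int_0^x e^{-u^2}\,du, \tag{18}$$

we have

$$\operatorname{erf} x = \frac{2}{\sqrt{\pi}}\left(x - \frac{x^3}{1!\,3} + \frac{x^5}{2!\,5} - \dots\right), \tag{19}$$

and hence

$$\exp(-ap^{1/2})\,H(t) = 1 - \operatorname{erf}\frac{a}{2\sqrt{t}}. \tag{20}$$

Alternatively, return to (11) and write

$$zt - az^{1/2} = t\left(z^{1/2} - \frac{a}{2t}\right)^2 - \frac{a^2}{4t}.$$

With a and t > 0, a/2t is positive and we can take a path for $z^{1/2}$ through it parallel to the
imaginary axis. Put then

$$z^{1/2} - \frac{a}{2t} = i\eta;$$

then

$$p^{1/2}\exp(-ap^{1/2})\,H(t) = \frac{1}{\pi}\int_{-\infty}^{\infty}\exp\left(-\frac{a^2}{4t} - t\eta^2\right)d\eta = \frac{1}{\sqrt{\pi t}}\exp\left(-\frac{a^2}{4t}\right). \tag{21}$$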


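Also if (12) is differentiated with regard to a we get (11) with the sign changed. Hence

$$\frac{\partial}{\partial a}\exp(-ap^{1/2})\,H(t) = -\frac{1}{\sqrt{\pi t}}\exp\left(-\frac{a^2}{4t}\right).$$

But if a = 0 (12) is simply H(t). Hence

$$\exp(-ap^{1/2})\,H(t) = 1 - \int_0^a\frac{1}{\sqrt{\pi t}}\exp\left(-\frac{u^2}{4t}\right)du \tag{22}$$

$$= 1 - \operatorname{erf}\frac{a}{2\sqrt{t}}. \tag{23}$$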

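The two series obtained above by term-by-term interpretation can be compared numerically with their closed forms. The following sketch is illustrative only; the function names and the number of terms are assumptions, and the library routine `math.erfc` supplies 1 - erf.

```python
import math

def series_exp(a, t, terms=60):
    """Partial sum of 1 - (2/sqrt(pi)) * sum (-1)^m x^(2m+1) / (m! (2m+1)),
    with x = a / (2 sqrt(t)); this is the series for exp(-a p^(1/2)) H(t)."""
    x = a / (2.0 * math.sqrt(t))
    s = 0.0
    for m in range(terms):
        s += (-1.0) ** m * x ** (2 * m + 1) / (math.factorial(m) * (2 * m + 1))
    return 1.0 - 2.0 * s / math.sqrt(math.pi)

def series_sqrt_exp(a, t, terms=60):
    """Partial sum of (1/sqrt(pi)) * sum a^(2m) t^(-m-1/2) / ((-4)^m m!);
    this is the series for p^(1/2) exp(-a p^(1/2)) H(t)."""
    s = 0.0
    for m in range(terms):
        s += a ** (2 * m) * t ** (-m - 0.5) / ((-4.0) ** m * math.factorial(m))
    return s / math.sqrt(math.pi)

if __name__ == "__main__":
    a, t = 1.0, 0.5
    print(series_exp(a, t), math.erfc(a / (2.0 * math.sqrt(t))))
    print(series_sqrt_exp(a, t), math.exp(-a * a / (4.0 * t)) / math.sqrt(math.pi * t))
```

For moderate values of $a/2\sqrt{t}$ the partial sums settle quickly; for large values the alternating terms grow before they decay and the closed forms are the better route.
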

12·13. The generalized principle of superposition. The operators just interpreted
do not satisfy our original rule that F(z) shall be expansible in powers of $z^{-1}$ when
|z| is large enough. The question is whether the interpretations will still satisfy the
principle of superposition. Let

$$f(t) = \frac{1}{2\pi i}\int_L F(z)\,e^{zt}\,\frac{dz}{z}. \tag{1}$$

Then it can be shown that under suitable restrictions

$$F(u) = u\int_0^{\infty} e^{-ut} f(t)\,dt. \tag{2}$$

Take $\Re(u)$ greater than c; F(z) is analytic for $\Re(z) \ge c$; and let the integral of $F(z)/z^2$
around an infinite semicircle to the right of L be zero. Then if we may invert the order
of integration

$$u\int_0^{\infty} e^{-ut} f(t)\,dt = u\int_0^{\infty} e^{-ut}\,dt\,\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} F(z)\,e^{zt}\,\frac{dz}{z} = \frac{u}{2\pi i}\int_{c-i\infty}^{c+i\infty} F(z)\,\frac{dz}{z}\int_0^{\infty} e^{-(u-z)t}\,dt$$

$$= \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}\frac{uF(z)}{z(u - z)}\,dz. \tag{3}$$

But $\Re(u)$ is greater than c; hence if we deform the path L into an infinite semicircle to the
right of L we pass over the pole at u, and over no other singularity. But by hypothesis
the integral along this semicircle is zero. Hence the last integral is $-2\pi i$ times the residue
at this pole, and therefore* $u\int_0^{\infty} e^{-ut} f(t)\,dt = F(u)$. The justification of the inversion of the
order of integration will be carried out later when we come to the Fourier-Mellin theorem,
which is really the converse of this.

Now suppose that we have two operators F(p) and G(p) satisfying the conditions just
imposed on F(p). Then $g(t - \tau) = 0$ if $\tau > t$, and

$$\int_0^t f(\tau)\,g(t - \tau)\,d\tau = \int_0^{\infty} f(\tau)\,g(t - \tau)\,d\tau = \frac{1}{2\pi i}\int_0^{\infty} f(\tau)\,d\tau\int_L e^{z(t-\tau)}\,G(z)\,\frac{dz}{z}. \tag{4}$$

But

$$\int_0^{\infty} f(\tau)\,e^{-z\tau}\,d\tau = F(z)/z; \tag{5}$$

hence

$$\int_0^t f(\tau)\,g(t - \tau)\,d\tau = \frac{1}{2\pi i}\int_L F(z)\,G(z)\,e^{zt}\,\frac{dz}{z^2} = p^{-1}F(p)\,G(p)\,H(t). \tag{6}$$

* This argument is due to S. Goldstein, Proc. Lond. Math. Soc. (2) 34, 1931, 104.

If we differentiate we have

$$F(p)\,G(p)\,H(t) = \int_0^t f(\tau)\,g'(t - \tau)\,d\tau + f(t)\,g(0), \tag{7}$$

which is our previous rule of superposition, extended to a much wider range of conditions.
Thus again the result of the successive application of two operators whose interpretations
are known can be reduced to a single integral.

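A simple numerical check of the superposition rule (7) may help to fix ideas. The operators below are assumptions chosen only because everything can be written down in closed form: F(p) = p/(p + 1) and G(p) = p/(p + 2), whose interpretations are $f(t) = e^{-t}$ and $g(t) = e^{-2t}$, while the product interpreted directly by residues gives $-e^{-t} + 2e^{-2t}$.

```python
import math

def simpson(fn, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = fn(a) + fn(b)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * fn(a + k * h)
    return s * h / 3.0

def f(t):
    return math.exp(-t)               # interpretation of F(p) = p/(p+1)

def g(t):
    return math.exp(-2.0 * t)         # interpretation of G(p) = p/(p+2)

def g_prime(t):
    return -2.0 * math.exp(-2.0 * t)

def superposition(t):
    """integral_0^t f(tau) g'(t - tau) dtau + f(t) g(0)."""
    return simpson(lambda tau: f(tau) * g_prime(t - tau), 0.0, t) + f(t) * g(0.0)

def direct(t):
    """F(p)G(p)H(t) from the residues of z e^{zt} / ((z+1)(z+2))."""
    return -math.exp(-t) + 2.0 * math.exp(-2.0 * t)

if __name__ == "__main__":
    for t in (0.3, 1.0, 2.5):
        print(t, superposition(t), direct(t))
```

The two columns agree to the accuracy of the quadrature, as the rule requires.
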

12·14. Integral equations of Abel’s and Poisson’s types. If

$$\int_0^x \phi(\xi)\,f(x - \xi)\,d\xi = g(x), \tag{1}$$

where f and g are given functions, and we wish to determine $\phi$ so that (1) will be true for
all x, we introduce the operational expressions indicated by capital letters and apply the
principle of superposition. Then (1) is equivalent to

$$\frac{1}{p}\,\Phi(p)\,F(p) = G(p), \tag{2}$$

$$\Phi(p) = \frac{pG(p)}{F(p)}, \tag{3}$$

whence $\phi(x)$ is found by substituting in the Bromwich integral and interpreting. This is
Abel’s type of integral equation.

Poisson’s type

$$\phi(x) + \int_0^x \phi(\xi)\,f(x - \xi)\,d\xi = g(x) \tag{4}$$

is treated similarly. We get

$$\Phi(p) + \frac{1}{p}\,\Phi(p)\,F(p) = G(p), \tag{5}$$

$$\Phi(p) = \frac{pG(p)}{p + F(p)}, \tag{6}$$

whence the solution follows. Examples are given by Goldstein.*

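As a small worked instance of Poisson’s type, suppose (purely for illustration) that f(x) = x and g(x) = 1, so that F(p) = 1/p and G(p) = 1. The rule then gives $\Phi(p) = p^2/(p^2 + 1)$, which is interpreted as $\phi(x) = \cos x$. The sketch below checks by quadrature that this $\phi$ satisfies the integral equation; the helper names are hypothetical.

```python
import math

def simpson(fn, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = fn(a) + fn(b)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * fn(a + k * h)
    return s * h / 3.0

def poisson_residual(phi, f, g, x, n=2000):
    """phi(x) + integral_0^x phi(xi) f(x - xi) dxi - g(x); zero for a solution."""
    return phi(x) + simpson(lambda xi: phi(xi) * f(x - xi), 0.0, x, n) - g(x)

if __name__ == "__main__":
    for x in (0.5, 2.0, 5.0):
        print(x, poisson_residual(math.cos, lambda u: u, lambda u: 1.0, x))
```

The same residual function can be used with other kernels; with f(x) = 1 and g(x) = 1 the rule gives $\Phi(p) = p/(p + 1)$ and hence $\phi(x) = e^{-x}$.
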
12·15. The staircase, parapet, and saw-tooth functions. Consider the series

$$(e^{-ph} + e^{-2ph} + e^{-3ph} + \dots)\,H(t) = H(t - h) + H(t - 2h) + \dots + H(t - nh) + \dots. \tag{1}$$

If [t/h] denotes the integral part of t/h, all terms with n < t/h are 1, all with n > t/h are 0,
and the number of the former set is [t/h]. Hence the function is equal to [t/h]. It is equal
to 0 for 0 < t < h, and rises discontinuously by 1 at t = h, 2h, 3h, .... It behaves like the
date, given as a function of time. Now if we translate this by Bromwich’s integral p is
replaced by z, whose real part is the positive quantity c; then the series is a convergent
geometrical series, and

$$e^{-zh} + e^{-2zh} + \dots = \frac{e^{-zh}}{1 - e^{-zh}} = \frac{e^{-\frac12 zh}}{2\sinh\tfrac12 zh}.$$

Hence

$$\left[\frac{t}{h}\right] = \frac{1}{2\pi i}\int_L\frac{e^{z(t - \frac12 h)}}{2\sinh\tfrac12 zh}\,\frac{dz}{z} = \frac{e^{-\frac12 ph}}{2\sinh\tfrac12 ph}\,H(t). \tag{2}$$

* Journ. Lond. Math. Soc. 6, 1931, 262-8.

Similarly the series

$$(\tfrac12 - e^{-ph} + e^{-2ph} - e^{-3ph} + \dots)\,H(t) \tag{3}$$

is equal to $+\tfrac12$ if [t/h] is 0 or an even integer, $-\tfrac12$ if [t/h] is an odd integer, and may therefore
be written as $\tfrac12(-1)^{[t/h]}$. The series also can be summed like a geometrical progression
and gives

$$\tfrac12\,\frac{1 - e^{-ph}}{1 + e^{-ph}}\,H(t) = \tfrac12\tanh(\tfrac12 ph)\,H(t) = \tfrac12(-1)^{[t/h]} \quad (t > 0). \tag{4}$$

These two functions are fundamental in the operational treatment of waves.

[Figure: graph of $\tfrac12\tanh\tfrac12 ph\,H(t)$.]

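The staircase and parapet functions are easily tabulated from their defining series of delayed unit functions. The following sketch is an illustration under simple assumptions: the series are cut off after a fixed number of terms (enough for the range of t considered), and the value of H at t = 0 is taken as 0.

```python
import math

def H(t):
    """Heaviside unit function; the value at t = 0 is taken as 0 here."""
    return 1.0 if t > 0 else 0.0

def staircase(t, h, terms=1000):
    """Partial sum of H(t - h) + H(t - 2h) + ..., equal to [t/h]."""
    return sum(H(t - n * h) for n in range(1, terms + 1))

def parapet(t, h, terms=1000):
    """Partial sum of (1/2 - e^{-ph} + e^{-2ph} - ...) H(t)."""
    return 0.5 * H(t) + sum((-1) ** n * H(t - n * h) for n in range(1, terms + 1))

def sawtooth(t, h):
    """t/h - [t/h] - 1/2, rising from -1/2 to 1/2 over each interval of length h."""
    return t / h - math.floor(t / h) - 0.5

if __name__ == "__main__":
    h = 0.5
    for t in (0.2, 0.7, 1.3, 3.9):
        print(t, staircase(t, h), parapet(t, h), sawtooth(t, h))
```

Away from the points of discontinuity the first function reproduces the integral part of t/h and the second alternates between 1/2 and -1/2.
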

The average of t/h for t between nh and nh + h is $n + \tfrac12 = [t/h] + \tfrac12$. Then the
function $t/h - [t/h] - \tfrac12$ is $-\tfrac12$ at t = 0, increases uniformly to $\tfrac12$ at t = h, drops discontinuously
to $-\tfrac12$, and then repeats itself periodically. Its operational expression is

$$\left(\frac{1}{ph} - \tfrac12\coth\tfrac12 ph\right)H(t). \tag{5}$$

Now the first stage of the Euler-Maclaurin formula of integration could be written 

Fmat = i/(o) +m+f( 2 h )+... +/«»-1 )h}+ i/(»fc)-- [j] - £]/'(*) at, (6) 

JiD 

and the operator ~kd_ 1 + — 1 (^) 


occurred in the operational derivation of the Euler-Maclaurin series. The similarity of
the two expressions suggests an intimate relation, but the operator (5) acts on $H(t)$ and
cannot be developed in ascending powers of $p$, whereas (7) must be developed in ascending
powers of $D$. A relation probably exists, but it does not seem likely to be simple enough to
replace the development of the Euler-Maclaurin formula by integration by parts.
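The first stage (6) of the Euler-Maclaurin formula can be verified numerically. The sketch below is an illustration under stated assumptions, not the book's procedure: the helper names `simpson` and `euler_maclaurin_sides` are hypothetical, and the integrals are evaluated with a composite Simpson rule, taking the saw-tooth term interval by interval so that $[t/h]$ is constant on each piece.

```python
import math

def simpson(g, a, b, m=200):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    step = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3

def euler_maclaurin_sides(f, fprime, h, n):
    """Return the left and right sides of formula (6) for step h and n intervals."""
    lhs = simpson(f, 0.0, n * h) / h
    trapezoid = 0.5 * f(0.0) + sum(f(k * h) for k in range(1, n)) + 0.5 * f(n * h)
    # On [kh, (k+1)h] the integral part [t/h] equals k.
    remainder = sum(
        simpson(lambda t, k=k: (t / h - k - 0.5) * fprime(t), k * h, (k + 1) * h)
        for k in range(n)
    )
    return lhs, trapezoid - remainder

if __name__ == "__main__":
    print(euler_maclaurin_sides(math.exp, math.exp, 0.5, 6))
```

The two returned numbers should agree to within the quadrature error, for any smooth $f$ whose derivative is supplied as `fprime`.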

12·16. Frullani's integrals. Consider

$$I = \int_0^\infty \{f(ax) - f(bx)\}\frac{dx}{x},$$

$f(x)$ satisfying certain conditions that will appear as we proceed. Break up the range of
integration into 0 to $\delta$ and $\delta$ to $\infty$, where $\delta$ is arbitrarily small. Then if the integral $I$
converges at the lower limit the integral over 0 to $\delta$ is arbitrarily small. Then

$$I = \lim_{\delta\to0}\int_\delta^\infty \{f(ax) - f(bx)\}\frac{dx}{x} = \lim_{\delta\to0}\left(\int_{a\delta}^\infty f(u)\frac{du}{u} - \int_{b\delta}^\infty f(u)\frac{du}{u}\right),$$

provided the two integrals converge; and this is

$$\lim_{\delta\to0}\int_{a\delta}^{b\delta} f(u)\frac{du}{u}.$$

But if $f(u)$ tends continuously to a finite limit as $u \to 0$ this approaches

$$f(0+)\int_{a\delta}^{b\delta}\frac{du}{u} = f(0+)\log\frac{b}{a}.$$
Then

$$I = f(0+)\log\frac{b}{a}.$$

In particular

$$\int_0^\infty (\cos ax - \cos bx)\frac{dx}{x} = \int_0^\infty (e^{-ax} - e^{-bx})\frac{dx}{x} = \log\frac{b}{a}.$$
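A numerical illustration of the argument may be helpful. The sketch below assumes $a = 1$, $b = 3$ and uses hypothetical helper names. It evaluates the boundary term $\int_{a\delta}^{b\delta} f(u)\,du/u$ for a small $\delta$ in the variable $s = \log u$, and also integrates the exponential case directly on a truncated range.

```python
import math

def simpson(g, a, b, m=2000):
    """Composite Simpson rule on [a, b] with m (even) subintervals."""
    step = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * step)
    return total * step / 3

def frullani_boundary_term(f, a, b, delta):
    """Integral of f(u) du/u from a*delta to b*delta, using s = log u so that du/u = ds."""
    return simpson(lambda s: f(math.exp(s)), math.log(a * delta), math.log(b * delta))

def frullani_exponential(a, b, upper=60.0, m=200000):
    """Direct quadrature of (exp(-a x) - exp(-b x))/x over [0, upper]."""
    def g(x):
        if x == 0.0:
            return b - a  # limiting value of the integrand at x = 0
        return (math.exp(-a * x) - math.exp(-b * x)) / x
    return simpson(g, 0.0, upper, m)

if __name__ == "__main__":
    print(frullani_boundary_term(math.cos, 1.0, 3.0, 1e-8), math.log(3.0))
    print(frullani_exponential(1.0, 3.0), math.log(3.0))
```

Both computed values should lie close to $\log 3$, since $f(0+) = 1$ for $\cos u$ and for $e^{-u}$.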


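Neither integral can be treated by contour integration. The first integrand is an odd
function and the integral obtained by changing the range to $(-\infty, \infty)$ vanishes identically.
The same method can be extended to many integrals of odd functions. Take, for instance,

$$\int_0^\infty \frac{\sin^3 x}{x^2}\,dx = \frac14\int_0^\infty \frac{3\sin x - \sin 3x}{x^2}\,dx = \lim_{\delta\to0}\frac14\int_\delta^\infty (3\sin x - \sin 3x)\frac{dx}{x^2}$$

$$= \lim_{\delta\to0}\frac14\left(\int_\delta^\infty 3\sin x\,\frac{dx}{x^2} - \int_{3\delta}^\infty 3\sin u\,\frac{du}{u^2}\right) = \frac34\lim_{\delta\to0}\int_\delta^{3\delta}\sin x\,\frac{dx}{x^2}$$

$$= \tfrac34\log 3.$$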


EXAMPLES 


/ CO /j «4 

-. 

o l +* 8 

2. Show that

$$\int_0^\infty \frac{\sin^2 x}{x^2}\,dx = \int_0^\infty \frac{\sin x}{x}\,dx, \qquad \int_0^\infty \frac{\sin^2 x}{x^2(1+x^2)}\,dx = \tfrac14\pi(1 + e^{-2}).$$


3. Find the principal value of 

4. Prove that as X tends to infinity 

' x cos ax — cos bx 
o as a 


I 


00 tan* 

0 * 


d*. 


/; 


5. If $B_r$ are Bernoulli's numbers, prove that

Bf_i B r _. 

; + — . + ...+ 


l!(r — 1)! 2 !(r — 2)! 


(r — 2)! 2! 2(r— 1)! rl 


H—: = Oo 


6. Prove that 


* « ( 2 *)* „ (2*) 4 
*cot* = 1 —+ 


2! ■ 4! 

7. $a$ and $b$ are real and positive and unequal constants. Find

dx 


/; 


0 (**-fa 2 ) 2 (**+&*)** 


(M.T. 1943.) 


dx - in{b -a) + 0 ^j (o > 0, 6 > 0). (M.T. 1938.) 


Is the result valid also when $a = b$?


(M.T. 1935.) 


8. Prove from Liouville's theorem, by considering $z^n/f(z)$, that any polynomial $f(z)$ of positive
degree $n$ has at least one zero. (M.T. 1935.)


9. Show that the roots of the equation 

z 6 — 12z 2 + 14 = 0 


lie between the circles I z I = 1 and I z I = How many of the roots lie inside the circle \ z\ — 2 ? 

1 (M.T. 1939.) 


10. If $f(z) = (z - a_1)\dots(z - a_n)$, prove that all the zeros of $f'(z)$ lie within any polygon with no
re-entrant angle that includes all the zeros of $f(z)$.

11. If $|f(z)|$ is constant on a closed contour on which $f'(z)$ nowhere vanishes, show that if $f(z)$ is
analytic within the contour it has one more zero than $f'(z)$ within the contour. (Macdonald.)

12. If $\phi(z) = b_0 + b_1 z + \dots$,

where $b_1 \neq 0$, the root of $\phi(z) = 0$ nearest the origin is the coefficient of $1/z$ in the Laurent expansion of

$$-\log\frac{\phi(z)}{z} = -\log\left\{b_1 + \frac{b_0}{z} + (b_2 z + \dots)\right\}.$$

(W. R. Andress, Math. Gaz. 27, 1943, 92.)


13. Show that

$$\int_{-\infty}^{\infty}\frac{\sin(x+a)\sin(x-a)}{x^2 - a^2}\,dx = \frac{\pi}{2a}\sin 2a.$$


14. If f(z)/z is uniformly bounded on a set of contours defined as in Mittag-Leffler s theorem, 
tending to infinity and /(z) has simple poles cl t of residue fi r , prove that 

saimac . . n . 

15. Evaluate the mtegral 2 (m> 0). 

J 0 •*'(•*' "r a ) 


16. By integrating the function $\dfrac{e^{az}}{1 + e^z}$ $(0<a<1)$ around a rectangle of breadth $2\pi$, prove that

$$\int_{-\infty}^{\infty}\frac{e^{ax}}{1 + e^x}\,dx = \frac{\pi}{\sin a\pi},$$

and derive an alternative proof that $z!\,(-z)! = \pi z\,\mathrm{cosec}\,\pi z$.


17. Prove that for $h > 0$

$$\int_0^\infty \sin\kappa x\;e^{-\kappa h}\,\frac{d\kappa}{\kappa} = \tan^{-1}\frac{x}{h}.$$


18. Prove that

r^ dt= 2 r^ , t-=i^

Jo< a -l Jo

19. If for small $|z|$

$$\sec z = 1 + \sum a_{2r}z^{2r},$$

prove that

$$1 - \frac{1}{3^{2r+1}} + \frac{1}{5^{2r+1}} - \dots = \frac{\pi^{2r+1}}{2^{2r+2}}\,a_{2r}.$$

(I.C. 1944.)





Chapter 13 

CONFORMAL REPRESENTATION 


Plus ça change, plus c'est la même chose.

Alphonse Karr, Les Guêpes, 1849


13·01. Conditions for conformal mapping. Let $\zeta$ and $z$ be two complex variables
related by

$$z = x + iy = f(\zeta) = f(\xi + i\eta), \tag{1}$$

where $f(\zeta)$ is an analytic function. Then if $\zeta$ moves along a curve in the plane of $\xi$, $\eta$, $z$ will
move along a curve in the plane of $x$, $y$. Every value of $\zeta$ that $f(\zeta)$ is defined for will identify
a point in the $x$, $y$ plane, and conversely if the inverse function exists for a value of $z$ a
value of $\zeta$ will be identified. Thus (1) can be regarded as a transformation, enabling us to
map at least a part of the $\xi$, $\eta$ plane on the $x$, $y$ plane and conversely.

The importance of this type of transformation rests on the facts that it is, in general,
continuous and conformal. For, in the first place, take $\gamma$ a small complex number and
consider $z + c = f(\zeta + \gamma)$. We have

$$c = f(\zeta + \gamma) - f(\zeta) = \gamma\{f'(\zeta) + \nu\},$$

where $\nu$ tends to zero with $\gamma$. Hence if $f'(\zeta)$ exists both the real and imaginary parts of $c$
tend to zero when those of $\gamma$ do. Thus the transformation from $\zeta$ to $z$ is continuous.
Conversely, that from $z$ to $\zeta$ is continuous provided that $d\zeta/dz = 1/f'(\zeta)$ exists. But in the
neighbourhood of a point where $f'(\zeta)$ or its reciprocal does not exist the transformation or
its inverse may not be continuous.

Now take two small complex numbers $\gamma$ and $\gamma'$ and consider the behaviour of

$$\frac{c'}{c} = \frac{f(\zeta + \gamma') - f(\zeta)}{f(\zeta + \gamma) - f(\zeta)} = \frac{\gamma'}{\gamma}\,\frac{f'(\zeta) + \nu'}{f'(\zeta) + \nu},$$

where now $\nu$ tends to zero with $\gamma$ and $\nu'$ with $\gamma'$. Then when $\gamma$ and $\gamma'$ tend to zero, in any
specified ratio, $c'/c$ tends to the same ratio, provided again that $f'(\zeta)$ has a definite value
different from zero. If it were zero, the limit of $c'/c$ would of course depend on the limit
of the ratio $\nu'/\nu$, about which we are not yet in a position to say anything. Apart from
these special cases, then, we have

$$\lim\frac{c'}{c} = \lim\frac{\gamma'}{\gamma},$$

which is the same as the pair of relations

$$\lim\left|\frac{c'}{c}\right| = \lim\left|\frac{\gamma'}{\gamma}\right|, \qquad \lim(\arg c' - \arg c) = \lim(\arg\gamma' - \arg\gamma).$$


That is, if we take a small triangle in the $\zeta$ plane its vertices will determine those of a small
triangle in the $z$ plane, and if we take a sequence of triangles of the same shape in the $\zeta$
plane, the size tending to zero, the corresponding triangles in the $z$ plane will tend to the same
shape, always providing $f'(\zeta) \neq 0$. This is what we mean by saying that the transformation is
conformal. For any two intersecting curves the angle of intersection is unaltered by the
transformation, and to pass from one to the other we must travel around the point of
intersection in the same sense. In particular since the axes of $\xi$ and $\eta$ are at right angles, two
curves in the $z$ plane on which $\xi$ and $\eta$ are respectively constant cut at right angles.
Corresponding to straight lines parallel to the $\xi$ axis we have a family of curves in the $z$ plane, and
corresponding to straight lines parallel to the $\eta$ axis a family of curves intersecting the first
family at right angles. We shall refer to these as the families respectively of $\eta$ constant and
$\xi$ constant ($\xi$ and $\eta$ respectively of course varying along the same curve).
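The preservation of angles, and its failure where $f'(\zeta) = 0$, can be seen numerically. The following sketch is illustrative only: the function name `image_angle` and the choice $f(\zeta) = \zeta^2$ are assumptions made here. It takes two small steps from a point, differing in direction by a given angle, and measures the angle between their images.

```python
import cmath
import math

def image_angle(f, zeta0, angle, eps=1e-6):
    """Angle between the images of two small steps from zeta0 whose directions differ by `angle`."""
    gamma = eps
    gamma_prime = eps * cmath.exp(1j * angle)
    c = f(zeta0 + gamma) - f(zeta0)
    c_prime = f(zeta0 + gamma_prime) - f(zeta0)
    return cmath.phase(c_prime / c)

if __name__ == "__main__":
    square = lambda zeta: zeta * zeta
    # Ordinary point: the angle pi/3 is reproduced.
    print(image_angle(square, 1 + 1j, math.pi / 3), math.pi / 3)
    # Stationary point zeta = 0 of zeta^2: the angle is doubled.
    print(image_angle(square, 0j, math.pi / 3), 2 * math.pi / 3)
```

At the stationary point of $\zeta^2$ the angle is multiplied by 2, which is the exponent $m$ of the leading term in the expansion about that point.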

The transformation is therefore conformal and continuous except possibly at places
where $f'(\zeta)$ or $1/f'(\zeta)$ does not exist. We have therefore to consider what happens at
singularities and stationary points of $f(\zeta)$. First take a branch point. Since $f(\zeta)$ does not
return to its original value when $\zeta$ makes a circuit about the point, more than one point
in the $z$ plane will correspond to a given point in the $\zeta$ plane unless we restrict the region
in the $\zeta$ plane in such a way as to prevent such circuits. Similarly for stationary points,
if $m$ is an integer $> 1$ and

$$f(\zeta) = f(\zeta_0) + \alpha(\zeta - \zeta_0)^m + \dots,$$

$\zeta$ will not be uniquely determined as a function of $z$. For a map to be useful it is clearly
necessary that there shall be a one-one correspondence between the place on the ground
and that on the map. Simple poles of $f(\zeta)$ do no harm in themselves. But if $f(\zeta)$ has a
multiple pole or an isolated essential singularity anywhere, $z$ will take a
given value more than once in the neighbourhood, so that $\zeta$ is not uniquely determined for given $z$.
This applies equally if the singularity
is at infinity. Apart therefore from unisolated essential singularities, which would be
difficult to treat either in general or in particular, the transformation can be one-one
over the whole plane only if $f(\zeta)$ is a rational function with at most simple poles and no
stationary points, and behaving at infinity like either $\zeta$, a constant, or $1/\zeta$. This limits
us to functions of the form

$$f(\zeta) = a + b\zeta + \sum_{r=1}^{n}\frac{a_r}{\zeta - \zeta_r}, \qquad f'(\zeta) = b - \sum_{r=1}^{n}\frac{a_r}{(\zeta - \zeta_r)^2}.$$


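$f'(\zeta)$ has $n$ double poles in a large circle $C$. But

$$\frac{1}{2\pi i}\int_C \frac{f''(\zeta)}{f'(\zeta)}\,d\zeta = 0 \quad \text{if } b \neq 0, \qquad = -2 \quad \text{if } b = 0,\ \sum a_r \neq 0.$$

Hence $f'(\zeta)$ has $2n$ zeros if $b \neq 0$, $2n - 2$ if $b = 0$ and $\sum a_r \neq 0$. If $b = 0$ and $\sum a_r = 0$, $f(\zeta) - a$
behaves for large $\zeta$ like $\zeta^{-m}$, where $m \ge 2$, and the transformation from $z$ to $\zeta$ is not
single-valued. Hence for $\zeta$ to be a single-valued function of $z$, either

$$b \neq 0,\quad n = 0,\quad f(\zeta) = a + b\zeta,$$

or

$$b = 0,\quad n = 1,\quad f(\zeta) = a + \frac{a_1}{\zeta - \zeta_1}.$$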


These are therefore the only transformations that need no cuts to give a one-one corre¬ 
spondence over the whole planes and do not involve unisolated essential singularities. 
The first is trivial, being merely a combination of a displacement and a change of scale. 
The second is interesting. We shall consider it in a simplified form. 




It will be found that successive application of transformations of these forms yields only
a transformation of the same form with different constants.

In other cases cuts must be made so that zeros of $f'(\zeta)$ and all singularities of $f(\zeta)$ lie
on the boundary or outside it. When they are made the departure from conformality at
the exceptional points is found to provide a means of transforming simple boundaries
in the $\zeta$ plane into an extraordinary variety in the $z$ plane and thereby arriving at solutions
of apparently very complicated problems.

We shall speak of any point in the $z$ plane or the $\zeta$ plane, such that $f'(\zeta)$ has a limit other
than 0 and $\infty$ when the point is approached, as an ordinary point of the transformation; and
any point such that $f'(\zeta) \to 0$, $|f'(\zeta)| \to \infty$, or $f'(\zeta)$ has no definite limit on approaching the
point, as a singular point of the transformation. That is, we call a point a singular point of
the transformation irrespective of whether $z$ has a singularity when considered as a function
of $\zeta$, or $\zeta$ has one when considered as a function of $z$, just as in considering multiple integrals
we called a transformation singular if the Jacobian tended to either 0 or $\infty$.

13·02. Transformations: scale factor. If now $F(\zeta)$ is an analytic function, and

$$z = f(\zeta), \qquad w = F(\zeta) = \phi + i\psi,$$

$w$ is also an analytic function of $z$. Therefore when $\phi$ and $\psi$ are expressed in terms of $x$
and $y$ they satisfy Laplace's equation in two dimensions. If one of them is constant over
a curve in the $\zeta$ plane, it will be constant over the transformed curve in the $z$ plane.
Therefore if we can find a solution of Laplace's equation in a region of the $\xi$, $\eta$ plane,
which is constant over a given boundary in that plane, then the same function expressed
in terms of $x$, $y$ will be a solution of Laplace's equation in the transformed figure and will
be constant over the transformed boundary. Consequently the method of conformal
representation is capable of solving a great variety of problems; any analytic function
will yield a new set.

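If $d\sigma$ is an element of area in the $\zeta$ plane and $dS$ the corresponding one in the $z$ plane,

$$dS = \left|\frac{\partial(x, y)}{\partial(\xi, \eta)}\right| d\sigma = \left|\frac{dz}{d\zeta}\right|^2 d\sigma,$$

by the Cauchy-Riemann relations: $|dz/d\zeta|$ is called the modulus or scale factor of the
transformation.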
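The relation between the Jacobian and the scale factor can be illustrated by finite differences. This is a sketch under assumptions made here, not taken from the text: the function name `jacobian` and the sample map $z = e^\zeta$ are chosen for illustration, and the partial derivatives are approximated by central differences.

```python
import cmath

def jacobian(f, zeta, eps=1e-6):
    """Central-difference estimate of d(x, y)/d(xi, eta) for z = f(zeta)."""
    dz_dxi = (f(zeta + eps) - f(zeta - eps)) / (2 * eps)             # x_xi + i y_xi
    dz_deta = (f(zeta + 1j * eps) - f(zeta - 1j * eps)) / (2 * eps)  # x_eta + i y_eta
    return dz_dxi.real * dz_deta.imag - dz_dxi.imag * dz_deta.real

if __name__ == "__main__":
    zeta = 0.3 + 0.7j
    # For z = exp(zeta) the derivative is exp(zeta), so |dz/dzeta|^2 = exp(2 * 0.3).
    print(jacobian(cmath.exp, zeta), abs(cmath.exp(zeta)) ** 2)
```

The two printed numbers should agree closely, as the Cauchy-Riemann relations require for any analytic $f$.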

If $\zeta$ describes a curve, points near the $z$ curve will be on the left or right of the $z$ curve
according as the corresponding points in the $\zeta$ plane are to the left or right of the $\zeta$ curve,
in consequence of the fact that rotations retain the same sense on transformation. Also
the normal distances are multiplied by $|dz/d\zeta|$.

If $w$ has a logarithmic singularity at an ordinary point $\zeta_0$ of the $\zeta$ plane,

$$w = A\log(\zeta - \zeta_0) + g(\zeta) = A\log(z - z_0) + G(z),$$

where $g(\zeta)$ and $G(z)$ are analytic; hence $w$ in the $z$ plane also has a logarithmic singularity
with the same coefficient. If $\zeta_0$ is a singularity of the transformation the coefficient will
be different, or the singularity of $w$ may not be logarithmic at all. We must however always
arrange for such singularities to be on the boundary or outside the region, and special
attention must be paid to them to make sure that the physical conditions correspond.






13·03. Interpretations of complex potential in electrostatics and hydrodynamics.

Hydrodynamics.

$$w = \phi + i\psi,$$

where $\phi$ is the velocity potential and $\psi$ the stream function. Then

dw d(b ,d\jr dd> ,dd> 

dz dx 1 dx dx % dy ~ U W * 

where $u$, $v$ are the components of velocity, and

$$\left|\frac{dw}{dz}\right| = (u^2 + v^2)^{1/2} = q,$$


the resultant velocity. If we take a curve joining P and Q 

=/r tr* - 



so that Wp is the flux across the curve, measured per unit length of a cylinder with its 

generators perpendicular to the plane of x , y. The directions of ds and dn must be oriented 
as shown. 

A line source at $z_0$ that emits fluid at a rate $2\pi m$ per unit length has the complex
potential

$$m\log(z - z_0).$$

Electrostatics.

$$w = \phi + i\psi,$$

where $\phi$ is the electrostatic potential and $\psi$ the charge function. Then


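$$\frac{dw}{dz} = -(E_x - iE_y), \qquad \left|\frac{dw}{dz}\right| = (E_x^2 + E_y^2)^{1/2},$$

$E_x$, $E_y$ being the components of electric field intensity. If a closed conductor connects
$P$, $Q$ and $dn$ is the normal outward from the conductor, the surface density $\sigma$ is $-\dfrac{1}{4\pi}\dfrac{\partial\phi}{\partial n}$,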


and the charge on PQ is 


-ill*—ss 


\jr 


per unit length perpendicular to the plane. Moreover, if $z = f(\zeta)$ and if $P'$, $Q'$ and the curve
joining them in the $\zeta$ plane correspond to $P$, $Q$ and $PQ$, $\phi$ and $\psi$ are the same for both planes
and the charge on $P'Q'$ per unit length perpendicular to the plane is the same as that on $PQ$,
although the surface densities at corresponding points are different. Hence the capacity of
a conductor in two dimensions is invariant under conformal transformation.

The sign is most conveniently checked by physical considerations.

A line charge at $z_0$ of density $e$ per unit length has the complex potential

$$-2e\log(z - z_0).$$

Current electricity.

We consider steady currents flowing in a sheet of thickness $t$ and specific conductivity
$\sigma$; if $\phi$ is the electrostatic potential and $\mathbf{j}$ the current density, $\mathbf{E}$ the field,

$$\mathbf{j} = \sigma\mathbf{E} = -\sigma\,\mathrm{grad}\,\phi.$$

The flow of current across an electrode between two points P and Q is 



The complex potential corresponding to an electrode of small circular cross-section,
leading in a current $J$ at $z_0$, is

$$-\frac{J}{2\pi\sigma t}\log(z - z_0).$$


If there is a source of fluid or current, or an electric charge at a point where the
transformation is conformal, then there is an equal source or charge at the corresponding
point in the transformed plane.

Suppose there is a source on the boundary at a point where it has a corner of angle $\alpha$.
If we consider the source to be entirely inside the boundary all the flow will go into the
region. If on the other hand it is a point source actually at the corner only a fraction $\alpha/2\pi$
will go into the region. If at this point the transformation is not conformal and the angle
between the corresponding lines is $\beta$ we must on this second supposition place a source
$\alpha/\beta$ times the original one at the corresponding point in the transformed plane.

We proceed to consider some special transformations. 

13·04. $z = A/\zeta$ ($A$ real). This gives

$$x = \frac{A\xi}{\xi^2 + \eta^2}, \quad y = -\frac{A\eta}{\xi^2 + \eta^2}; \qquad \xi = \frac{Ax}{x^2 + y^2}, \quad \eta = -\frac{Ay}{x^2 + y^2}.$$

It therefore resembles an inversion, since $|z|\,|\zeta| = |A|$, a constant. But unlike an
inversion, it does not make the directions of $(x, y)$ and $(\xi, \eta)$ from their respective origins
coincident. If $\zeta$ describes a circle about the origin, $z$ also describes one, but in the opposite
sense, the outside of each becoming the inside of the other. But it still follows that,
in general, circles transform into circles, unless they pass through the origin, when they
transform into straight lines.

As an example, consider a circular cylinder of radius $a$ lying on its side on a plane. We take
the axes as shown. Incompressible fluid is flowing parallel to the axis of $x$ with velocity $U$
at a great distance. The boundary condition is that the stream function $\psi$ is constant over the
solid boundary and therefore has the same value over the plane and the circle. For large $|z|$
the complex potential $w$ must approximate to $Uz$.

Putting $z = A/\zeta$, we see that the axis of $x$ transforms into the $\xi$-axis. The circle will
transform into a line parallel to the axis of $\xi$, which is identified by taking $z = 2ia$, which
gives $\zeta = -iA/2a$. It is convenient to take $A$ so that this is equal to $\frac12 i\pi$ and therefore

$$A = -\pi a.$$

Then

$$z = -\frac{\pi a}{\zeta}.$$

For $z$ large $w$ approximates to $Uz = -\pi aU/\zeta$, which represents a doublet source at the
origin. We must add extra terms to make $w$ purely real on the boundaries $\eta = 0$ and
$\eta = \frac12\pi$. For $\eta = 0$ this is already satisfied, but for $\eta = \frac12\pi$ it is not, but can be adjusted
by adding a term $-\pi aU/(\zeta - i\pi)$. This is analytic when $0 \le \eta \le \frac12\pi$, and therefore admissible,
but it disturbs the condition over $\eta = 0$ and a further term $-\pi aU/(\zeta + i\pi)$ is needed.
Proceeding in this way, and using 12·06, we have

$$w = -\pi aU\left(\frac1\zeta + \frac{1}{\zeta - i\pi} + \frac{1}{\zeta + i\pi} + \frac{1}{\zeta - 2i\pi} + \frac{1}{\zeta + 2i\pi} + \dots\right)$$

$$= -\pi aU\coth\zeta = \pi aU\coth\frac{\pi a}{z}.$$


To verify that this satisfies the conditions, notice first that it is real for $z = x$ and
approximates to $Uz$ for $|z|$ large, as it should. On the circle the radius vector is $2a\sin\theta$, and

$$z = 2a\sin\theta\,e^{i\theta},$$

$$\zeta = -\tfrac12\pi\,\mathrm{cosec}\,\theta\,e^{-i\theta} = -\tfrac12\pi(\cot\theta - i),$$

$$w = -\pi aU\coth(\tfrac12\pi i - \tfrac12\pi\cot\theta) = \pi aU\tanh(\tfrac12\pi\cot\theta),$$

and this is real.

We are interested mainly in the velocity $q$ and the pressure. We have

$$q = \left|\frac{dw}{dz}\right| = \left|\frac{\pi^2a^2U}{z^2\sinh^2(\pi a/z)}\right|.$$

When $z \to 0$ through real values $q \to 0$; thus the velocity vanishes on the line of contact.
At the top point $z = 2ia$, $\zeta = \frac12 i\pi$,

$$q = \left|\tfrac14\pi^2U\,\mathrm{cosec}^2\tfrac12\pi\right| = \tfrac14\pi^2U.$$

The velocity at the top of the cylinder is therefore about 2·5 times that at a large distance.
The pressure at the top is correspondingly lower. This implies that a stream tends to lift
objects lying on the bottom and is connected with the theory of the transport of sediments.*
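The solution $w = \pi aU\coth(\pi a/z)$ can be checked numerically against the boundary conditions. The sketch below is only an illustration: the function names `complex_potential` and `speed` are chosen here, and units with $a = U = 1$ are assumed by default.

```python
import cmath
import math

def complex_potential(z, a=1.0, U=1.0):
    """w = pi a U coth(pi a / z) for flow past a cylinder of radius a resting on the plane y = 0."""
    return math.pi * a * U / cmath.tanh(math.pi * a / z)

def speed(z, a=1.0, U=1.0):
    """q = |dw/dz| = |pi^2 a^2 U / (z^2 sinh^2(pi a / z))|."""
    s = cmath.sinh(math.pi * a / z)
    return abs(math.pi ** 2 * a ** 2 * U / (z ** 2 * s ** 2))

if __name__ == "__main__":
    for theta in (0.3, 1.0, 2.0, 2.8):
        z = 2 * math.sin(theta) * cmath.exp(1j * theta)  # point on the cylinder
        print(theta, complex_potential(z).imag)          # stream function, close to 0
    print(speed(2j), math.pi ** 2 / 4)                   # top of the cylinder
    print(speed(1000 + 1000j))                           # far field, close to U
```

The imaginary part of $w$ should vanish on the circle (the point $\theta = \frac12\pi$ is avoided because the hyperbolic tangent has a pole there), and the speed at the top should be $\frac14\pi^2 U$, about $2.47U$.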



13·05. $z = \zeta^n$ ($n$ real, $n \neq 1$). The transformation is not conformal at the origin.
Since $|z| = |\zeta|^n$, $\arg z = n\arg\zeta$, lines through the origin in one figure correspond
to lines through the origin in the other, and circles about the origin to circles about the
origin. This transformation is therefore useful for problems relating to plane boundaries
meeting in a line. If the $\zeta$ figure is a pair of lines meeting at $\pi$, and therefore including the
upper half of the $\zeta$ plane, the lines in the $z$ plane will therefore meet at $n\pi$. $n$ cannot be
greater than 2 for this $\zeta$ figure because $\arg z$ would then exceed $2\pi$ for some values of $\zeta$ and
the representation would not be unique. The transformation covers planes meeting at
angles $\le 2\pi$.


* For the resultant force on the cylinder, cf. Proc. Camb. Phil. Soc. 25, 1929, 272-6.
Consider a line source of fluid at $z = z_0$, between planes meeting at $n\pi$. The complex
potential due to the source is

$$w_0 = -\frac{m}{2\pi}\log(z - z_0),$$

and that in the $\zeta$ figure is accordingly $-\dfrac{m}{2\pi}\log(\zeta - \zeta_0)$, since $z_0$ is not a singularity of the
transformation. The stream function is

$$\psi_0 = -\frac{m}{2\pi}\tan^{-1}\frac{\eta - \eta_0}{\xi - \xi_0}.$$

We require the complete stream function to be constant when $\eta = 0$. $\psi_0$ does not satisfy
this condition, but if we add on the result of changing $\eta_0$ to $-\eta_0$ it will. Hence

$$w = -\frac{m}{2\pi}\log\{(\zeta - \xi_0 - i\eta_0)(\zeta - \xi_0 + i\eta_0)\}$$

satisfies all the conditions. Therefore

$$w = -\frac{m}{2\pi}\log\{(z^{1/n} - z_0^{1/n})(z^{1/n} - z_0^{*1/n})\}$$

is the solution, where $z_0^* = x_0 - iy_0$.

When n is the reciprocal of an integer this result can be found by the method of images. 
Otherwise the latter method fails because it leads to images, and therefore new singu¬ 
larities, within the region of flow. 
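The boundary condition for the source between two planes can be checked numerically. The sketch below is illustrative and rests on assumptions made here: the function name `wedge_potential` is hypothetical, points are given in polar form $(r, \theta)$ so that $\zeta = r^{1/n}e^{i\theta/n}$ is unambiguous for angles up to $n\pi$, and the minus sign on the logarithm follows the formulae above.

```python
import cmath
import math

def wedge_potential(r, theta, r0, theta0, n, m=1.0):
    """Complex potential of a line source at (r0, theta0) between planes theta = 0 and theta = n*pi."""
    zeta = r ** (1 / n) * cmath.exp(1j * theta / n)
    zeta0 = r0 ** (1 / n) * cmath.exp(1j * theta0 / n)
    # Source at zeta0 together with its image at the conjugate point.
    return -(m / (2 * math.pi)) * cmath.log((zeta - zeta0) * (zeta - zeta0.conjugate()))

if __name__ == "__main__":
    n = 1.5                         # planes meeting at 3*pi/2; no finite image system exists
    r0, theta0 = 1.0, 0.4 * n * math.pi
    for r in (0.2, 1.0, 3.0):
        on_first_wall = wedge_potential(r, 0.0, r0, theta0, n).imag
        on_second_wall = wedge_potential(r, n * math.pi, r0, theta0, n).imag
        print(r, on_first_wall, on_second_wall)
```

The stream function (the imaginary part) should take the same value, here zero, on both walls, for any admissible $n$ and not only for reciprocals of integers.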

The conjugate velocity vector is 

. dw dw Idz 

$dw/d\zeta$ is analytic at $z = 0$; but $dz/d\zeta$ behaves like $\zeta^{n-1} = z^{1-1/n}$, which tends to 0 if $n > 1$
and to $\infty$ if $n < 1$. Hence the velocity will tend to 0 or infinity at the corner according as
$n < 1$ or $n > 1$, that is, according as the angle occupied by the fluid is less or greater than $\pi$.
In the latter case the classical theory fails to give a good approximation to the motion of
a real fluid; it breaks down near a projecting corner for reasons connected with viscosity.

13·06. $z = ae^\zeta$ ($a$ real). Since $\zeta = \log(z/a)$, $\zeta$ is a many-valued function of $z$ and $z = 0$
is a branch point. We have

$$x = ae^\xi\cos\eta, \qquad y = ae^\xi\sin\eta$$

and therefore every pair of values of $x$, $y$ except $(0, 0)$ is obtained by letting $\eta$ vary over a
range $2\pi$ and $\xi$ from $-\infty$ to $+\infty$. The whole $x$, $y$ plane is therefore mapped on an infinite
strip between $\eta = 0$ and $\eta = 2\pi$, or if more convenient for the particular problem,
$-\pi < \eta \le \pi$. Half the plane is mapped on a strip of width $\pi$. The circle $|z| = a$
corresponds to the line $\xi = 0$, $0 \le \eta < 2\pi$; the interior to the semi-infinite strip to the left of the
$\eta$ axis and the exterior to the right of the $\eta$ axis. In the $z$ plane the curves of constant $\xi$
are circles with the origin as centre, and the curves of constant $\eta$ are straight lines through
the origin.

13·07. Coaxal circles. $\zeta = c\log\dfrac{z - a}{z + a}$. For simplicity we take $a$ and $c$ real. Then,
writing $z - a = r_1e^{i\theta_1}$ and $z + a = r_2e^{i\theta_2}$,

$$\xi = c\log\frac{r_1}{r_2}, \qquad \eta = c(\theta_1 - \theta_2).$$

Here $\theta_1 - \theta_2$ is not unique and to make $\zeta$ one-valued we must prevent it from varying
by more than $2\pi$. This can be done either by making cuts along the real axis from $-\infty$
to $-a$, and from $+a$ to $+\infty$, or by making one from $-a$ to $+a$. We take the former case.
Then $\theta_1$ and $\theta_2$ satisfy

$$0 \le \theta_1 < 2\pi, \qquad -\pi < \theta_2 \le \pi.$$

The family of curves $\eta$ = constant are a set of coaxal circles through $A(+a)$ and $B(-a)$,
while the family $\xi$ = constant is the orthogonal set of coaxal circles with $A$ and $B$ as the
limit points.




For the part of the circle $C$ above the real axis

$$\theta_1 - \theta_2 = APB = \alpha,$$

the angle in the segment $APB$. For the lower part

$$(2\pi - \theta_1) + \theta_2 = \pi - \alpha,$$

that is, $\theta_1 - \theta_2 = \pi + \alpha$.

Thus $\theta_1 - \theta_2$ changes discontinuously by $\pm\pi$ when $P$ passes through $A$ or $B$. For $0 < \alpha < \pi$
it is always equal to the angle $APB$ on the side that faces downwards.

As we take all such circles with $\alpha$ from 0 to $\pi$, $\eta$ goes from 0 to $2c\pi$. Straight lines $\eta = c\alpha$
with $0 < \alpha < \pi$ correspond to arcs of circles in the upper half of the $z$ plane, while those with
$\pi < \alpha < 2\pi$ correspond to arcs in the lower half of the $z$ plane. $\eta = c\pi$ gives the straight line
$BA$. $\eta \to 0$ for points approaching the $x$ axis outside $BA$ from above, $\eta \to 2c\pi$ for points
approaching it from below.

Now take the orthogonal system

$$\log\frac{r_1}{r_2} = \frac{\xi}{c}.$$

For any point $P$ take the circle $APB$ and draw the tangent to it, cutting the axis of $x$ in $K$.
The triangles $KPA$, $KBP$ are similar, and

$$\frac{r_1}{r_2} = \frac{AP}{BP} = \frac{PK}{BK} = \frac{AK}{PK}.$$

Then

$$\frac{AK}{BK} = \left(\frac{r_1}{r_2}\right)^2.$$

Therefore for given $\xi$, $K$ is a fixed point. Also

$$KP^2 = KA\cdot KB,$$

and therefore is constant. Hence the points $P$ with given $\xi$ lie on a circle. If $r_1/r_2$ is small
the circle is a small one enclosing $A$; if it is large we get a small circle enclosing $B$. Thus
the curves of $\xi$ constant are the coaxal circles with limiting points $A$, $B$ orthogonal to the
intersecting circles of constant $\eta$.




Analytically,

$$\frac{r_1^2}{r_2^2} = \frac{(x - a)^2 + y^2}{(x + a)^2 + y^2} = e^{2\xi/c},$$

$$\frac{x^2 + y^2 + a^2}{2ax} = -\frac{e^{2\xi/c} + 1}{e^{2\xi/c} - 1} = -\coth\frac{\xi}{c},$$

$$x^2 + y^2 + 2ax\coth\frac{\xi}{c} + a^2 = 0.$$

Thus there is a one-one correspondence between the whole of the $z$ plane and the
infinite strip of width $2c\pi$ parallel to the $\xi$ axis on the $\zeta$ plane.
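The equation of the circles of constant $\xi$ can be verified by mapping points of the $\zeta$ plane into the $z$ plane. The sketch below is illustrative: the function name `z_from_zeta` is chosen here and $a = c = 1$ is assumed by default. It inverts the transformation as $z = a(1 + e^{\zeta/c})/(1 - e^{\zeta/c})$ and evaluates the left side of the circle equation.

```python
import cmath
import math

def z_from_zeta(zeta, a=1.0, c=1.0):
    """Invert zeta = c log((z - a)/(z + a)) to give z = a (1 + e^(zeta/c)) / (1 - e^(zeta/c))."""
    e = cmath.exp(zeta / c)
    return a * (1 + e) / (1 - e)

def circle_residual(xi, eta, a=1.0, c=1.0):
    """Left side of x^2 + y^2 + 2 a x coth(xi/c) + a^2 = 0 at the image of (xi, eta)."""
    z = z_from_zeta(complex(xi, eta), a, c)
    return z.real ** 2 + z.imag ** 2 + 2 * a * z.real / math.tanh(xi / c) + a ** 2

if __name__ == "__main__":
    for eta in (0.5, 1.5, 3.0, 5.0):
        print(eta, circle_residual(0.7, eta))
```

The residual should vanish, to rounding error, for every $\eta$ at a fixed $\xi \neq 0$.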

This transformation is extremely useful in solving problems on coaxal cylinders. For 
example: 


13·071. A long hollow metal cylinder, whose equation is $r = a$, is divided into two parts
by the plane $\sin\theta = 0$. The parts are slightly separated and charged to potentials $V_1$, $V_2$, and
two thin sheets of metal at potential $V_3$ occupy the regions $r > a$, $\theta = 0$ and $r > a$, $\theta = \pi$. By
means of the transformation $\zeta = \log(z - a) - \log(z + a)$, or otherwise, show that the surface
density at a point on the inner surface of the cylinder is

$$\frac{V_1 - V_2}{4\pi^2a\sin\theta},$$

and find the surface density on the outside of the cylinder and on both sides of the planes.

By means of the transformation the upper side of $-\infty B$ and $A\infty$ becomes $\eta = 0$, and
the lower sides become $\eta = 2\pi$. $BPA$ becomes $\eta = \frac12\pi$ and $BQA$ becomes $\eta = \frac32\pi$.




The problem is thus reduced to one of parallel plate condensers. Considering the inner
condenser between $\eta = \frac12\pi$ and $\frac32\pi$, we have the surface densities in the $\zeta$ plane as

$$\pm\frac{V_1 - V_2}{4\pi\cdot\pi}.$$

Hence that at a point on the inner surface of the cylinder in the $z$ plane is

$$\frac{V_1 - V_2}{4\pi^2}\left|\frac{d\zeta}{dz}\right|.$$

But

$$\frac{d\zeta}{dz} = \frac{1}{z - a} - \frac{1}{z + a} = \frac{2a}{(z - a)(z + a)},$$

$$\left|\frac{d\zeta}{dz}\right| = \frac{2a}{r_1r_2} = \frac{2a}{4a^2\sin\frac12\theta\cos\frac12\theta} = \frac{1}{a\sin\theta}.$$

Hence the surface density is

$$\frac{V_1 - V_2}{4\pi^2a\sin\theta}.$$

Similarly, that on the outside of the cylinder is

$$\frac{V_1 - V_3}{2\pi^2a\sin\theta}, \qquad \frac{V_2 - V_3}{2\pi^2a\sin\theta};$$

on the upper side of the planes it is

$$\frac{V_1 - V_3}{\pi^2}\,\frac{a}{x^2 - a^2},$$

and on the lower side

$$\frac{V_2 - V_3}{\pi^2}\,\frac{a}{x^2 - a^2}.$$
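The step from the $\zeta$ plane to the $z$ plane depends on the value of $|d\zeta/dz|$ on the cylinder, which can be confirmed numerically. This is a minimal sketch with hypothetical function names; it evaluates $|d\zeta/dz|$ at $z = ae^{i\theta}$ and forms the inner surface density from it.

```python
import cmath
import math

def scale_on_cylinder(theta, a=1.0):
    """|d zeta / dz| for zeta = log(z - a) - log(z + a), evaluated at z = a e^(i theta)."""
    z = a * cmath.exp(1j * theta)
    return abs(1 / (z - a) - 1 / (z + a))

def inner_surface_density(theta, v1, v2, a=1.0):
    """Density (V1 - V2)/(4 pi^2) in the zeta plane multiplied by the scale factor."""
    return (v1 - v2) / (4 * math.pi ** 2) * scale_on_cylinder(theta, a)

if __name__ == "__main__":
    a = 2.0
    for theta in (0.4, 1.2, 2.5):
        print(theta, scale_on_cylinder(theta, a), 1 / (a * math.sin(theta)))
```

For $0 < \theta < \pi$ the computed scale factor should coincide with $1/(a\sin\theta)$.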

13·072. Capacity of a condenser formed by two non-concentric circles. Let
two circles of the system, with $c = 1$, have radii $\alpha$ and $\beta$ ($\alpha > \beta$) and let the distance between
their centres be $d$. They can be written

$$(x + a\coth\xi)^2 + y^2 = a^2\,\mathrm{cosech}^2\,\xi,$$

and therefore

$$\alpha = a\,\mathrm{cosech}\,\xi_1, \qquad \beta = a\,\mathrm{cosech}\,\xi_2,$$

$$d = a(\coth\xi_1 - \coth\xi_2) = a\,\frac{\sinh(\xi_2 - \xi_1)}{\sinh\xi_1\sinh\xi_2}.$$

Thus

$$\sinh(\xi_2 - \xi_1) = \frac{ad}{\alpha\beta}.$$

If $OC_1 = X$, $OC_2 = X + d$, then since $a$ is the length of the tangent from $O$ to all the
circles

$$a^2 = X^2 - \beta^2 = (X + d)^2 - \alpha^2.$$

Hence

$$2dX = \alpha^2 - \beta^2 - d^2,$$

$$\cosh^2(\xi_2 - \xi_1) = \frac{d^2}{\alpha^2\beta^2}(X^2 - \beta^2) + 1 = \frac{(\alpha^2 - \beta^2 - d^2)^2 - 4d^2\beta^2 + 4\alpha^2\beta^2}{4\alpha^2\beta^2},$$

$$\cosh(\xi_2 - \xi_1) = \frac{\alpha^2 + \beta^2 - d^2}{2\alpha\beta}.$$

If now we have a condenser formed by two conducting cylinders in the $z$ plane it
becomes a parallel plate condenser in the $\zeta$ plane, the separation being $\xi_2 - \xi_1$. The capacity
per unit length is

$$\frac{2\pi}{4\pi|\xi_2 - \xi_1|} = \frac{1}{2\cosh^{-1}\{(\alpha^2 + \beta^2 - d^2)/2\alpha\beta\}},$$

and this therefore gives the capacity in the $z$ plane.
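The capacity formula can be tested against the construction that leads to it. In the sketch below the function name `capacity_per_unit_length` is an assumption of this illustration. The radii and the distance between centres are built from chosen values of $a$, $\xi_1$, $\xi_2$, and the result is compared with $1/\{2(\xi_2 - \xi_1)\}$. The concentric case $d = 0$ is compared with $1/\{2\log(\alpha/\beta)\}$, which follows from the same formula because $\cosh\log(\alpha/\beta) = (\alpha^2 + \beta^2)/2\alpha\beta$.

```python
import math

def capacity_per_unit_length(alpha, beta, d):
    """Capacity per unit length of two non-concentric circular cylinders (radii alpha > beta,
    centres a distance d apart, one inside the other), in the units used in the text."""
    return 1.0 / (2.0 * math.acosh((alpha ** 2 + beta ** 2 - d ** 2) / (2 * alpha * beta)))

if __name__ == "__main__":
    a, xi1, xi2 = 1.0, 0.5, 1.3
    alpha = a / math.sinh(xi1)
    beta = a / math.sinh(xi2)
    d = a * (1 / math.tanh(xi1) - 1 / math.tanh(xi2))
    print(capacity_per_unit_length(alpha, beta, d), 1 / (2 * (xi2 - xi1)))
    print(capacity_per_unit_length(3.0, 1.0, 0.0), 1 / (2 * math.log(3.0)))
```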

13·073. Another form of the transformation is

$$\frac{z - a}{z + a} = e^{\zeta/c},$$

that is,

$$\frac{z}{a} = \frac{e^{\zeta/c} + 1}{1 - e^{\zeta/c}}.$$

The restriction of $a$ and $c$ to be real is unnecessary; a more general form is

$$\zeta = c\log\frac{z - z_1}{z - z_2}.$$

Taking $c$ as a pure imaginary is equivalent to turning the axes in the $\zeta$ plane through a
right angle.

The transformation is related to some of the most fundamental problems in two dimensions.
Thus, in electrostatics, if there are charges $\pm\lambda$ per unit length along two parallel wires
the complex potential is

$$w = -2\lambda\log\frac{z - a}{z + a},$$

and it follows at once that the equipotentials are coaxal circular cylinders with the wires
as limiting lines, and the lines of force are the orthogonal set of circles. Similarly, in
hydrodynamics, for parallel lines of sources and sinks of equal and opposite strength,
the coaxal cylinders about the lines give the surfaces of equal $\phi$, the orthogonal surfaces
the stream lines. Analogous relations for the flow due to a pair of line vortices, and the
magnetic field due to a pair of electric currents, will suggest themselves.


13·08. Confocal conics. $z = c\cosh\zeta$ ($c$ real).

$$x + iy = c(\cosh\xi\cos\eta + i\sinh\xi\sin\eta),$$

$$x = c\cosh\xi\cos\eta, \qquad y = c\sinh\xi\sin\eta.$$

Then the curves of $\xi$ constant are

$$\frac{x^2}{c^2\cosh^2\xi} + \frac{y^2}{c^2\sinh^2\xi} = 1.$$

These are ellipses with foci at $(\pm c, 0)$, axes $c\cosh\xi$, $c\sinh\xi$. $\eta$ is then the eccentric angle.
If $\eta$ = constant

$$\frac{x^2}{c^2\cos^2\eta} - \frac{y^2}{c^2\sin^2\eta} = 1.$$

These are hyperbolas with the same foci.
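A quick numerical check of the two families of conics may be useful. The sketch below is illustrative only; the function name `confocal_point` and the sample values of $\xi$, $\eta$, $c$ are chosen here.

```python
import cmath
import math

def confocal_point(xi, eta, c=1.0):
    """Point (x, y) given by z = c cosh(xi + i eta)."""
    z = c * cmath.cosh(complex(xi, eta))
    return z.real, z.imag

if __name__ == "__main__":
    xi, eta, c = 0.8, 1.1, 2.0
    x, y = confocal_point(xi, eta, c)
    ellipse = x ** 2 / (c * math.cosh(xi)) ** 2 + y ** 2 / (c * math.sinh(xi)) ** 2
    hyperbola = x ** 2 / (c * math.cos(eta)) ** 2 - y ** 2 / (c * math.sin(eta)) ** 2
    print(ellipse, hyperbola)  # both should be close to 1
```

The same point lies on the ellipse of its $\xi$ and on the hyperbola of its $\eta$, which is the statement that the two families form an orthogonal net.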

It must be noticed, however, that a particular value of $\eta$ does not give the whole
hyperbola. If $0 \le \eta < \frac12\pi$ and $\xi$ ranges from $-\infty$ to $\infty$, we get the right branch; if
$\frac12\pi < \eta \le \pi$ we get the left. On the other hand $+\xi$ and $-\xi$, where $0 \le \eta < 2\pi$, give the
same ellipse.

Problems relating to elliptic boundaries are much more frequent than those for hyperbolic
boundaries. We therefore make the restriction $\xi \ge 0$, and then get the whole ellipse
by taking $0 \le \eta < 2\pi$. But now with $\xi$ positive we do not even get the whole of one branch
of the hyperbola for a particular value of $\eta$; in fact if $\eta$ is an acute angle the parts of the
hyperbola in the four quadrants are given by $\eta$, $\pi - \eta$, $\pi + \eta$, $2\pi - \eta$.

With the restrictions made, the semi-infinite strip of width $2\pi$ in the $\zeta$ plane transforms
into the whole of the $z$ plane. The function $\cosh^{-1}(z/c)$ is many-valued, but we have made
it single-valued by making a cut in the $z$ plane from $(-c, 0)$ to $(\infty, 0)$. The upper half of
the $z$ plane transforms into the semi-infinite strip $\xi > 0$, $0 \le \eta \le \pi$, and the lower half into
$\xi > 0$, $\pi < \eta < 2\pi$. With $\xi = 0$, $z$ travels from $(c, 0)$ to $(-c, 0)$ and back as $\eta$ increases from 0
to $2\pi$. The transformation is not conformal at $z = \pm c$, and the angle $\pi$ between adjoining
parts of the $x$ axis from these points transforms into an angle $\frac12\pi$ at the corresponding
points in the $\zeta$ plane.

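Suppose now that we have to solve $\nabla^2\phi = 0$ in a region bounded by an ellipse of the
system. The solution must have period $2\pi$ in order that $\phi$ may be single-valued; for if
not we should be in the situation of having a waistcoat whose buttons did not come
opposite to the buttonholes when it was put on. This condition is satisfied by any linear
combination of $e^{n\zeta}$ and $e^{-n\zeta}$ for integral $n$. If we are concerned with a region containing
the line of foci a further condition is needed. The cut in this case is not a physical barrier
but a mathematical device, and must introduce no discontinuities in the physical problem.
Not only $\phi$, therefore, but $\partial\phi/\partial x$ and $\partial\phi/\partial y$ must be continuous on crossing the line of foci
and thereby changing $\eta$ into $2\pi - \eta$. But the normal to the line of foci, on either side, is
the direction of increasing $\xi$, which is that of increasing $y$ on the upper side and decreasing
$y$ on the lower. In fact

$$\frac{\partial\phi}{\partial\xi} = \frac{\partial\phi}{\partial x}\frac{\partial x}{\partial\xi} + \frac{\partial\phi}{\partial y}\frac{\partial y}{\partial\xi} = c\left(\sinh\xi\cos\eta\,\frac{\partial\phi}{\partial x} + \cosh\xi\sin\eta\,\frac{\partial\phi}{\partial y}\right) \to c\sin\eta\,\frac{\partial\phi}{\partial y},$$

$$\frac{\partial\phi}{\partial\eta} = \frac{\partial\phi}{\partial x}\frac{\partial x}{\partial\eta} + \frac{\partial\phi}{\partial y}\frac{\partial y}{\partial\eta} = c\left(-\cosh\xi\sin\eta\,\frac{\partial\phi}{\partial x} + \sinh\xi\cos\eta\,\frac{\partial\phi}{\partial y}\right) \to -c\sin\eta\,\frac{\partial\phi}{\partial x},$$

and if $\partial\phi/\partial x$, $\partial\phi/\partial y$ are to have the same values for $\xi = 0$ on replacing $\eta$ by $2\pi - \eta$, $\partial\phi/\partial\xi$
and $\partial\phi/\partial\eta$ must either be 0 or reverse their signs. The admissible solutions are therefore
$\cosh n\xi\cos n\eta$, for which $\partial\phi/\partial\xi = 0$ at $\xi = 0$ and $\partial\phi/\partial\eta$ reverses its sign, and $\sinh n\xi\sin n\eta$,
for which $\partial\phi/\partial\eta = 0$ and $\partial\phi/\partial\xi$ reverses its sign. The solutions $\cosh n\xi\sin n\eta$ and
$\sinh n\xi\cos n\eta$ are not admissible for a complete ellipse. They would enter if the line of
foci was occupied by a barrier, for then $\partial\phi/\partial x$, $\partial\phi/\partial y$ need not be continuous across it.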

For external problems the data will include information about the behaviour of φ for large ξ. The disturbance due to the presence of the elliptic boundary must then tend to zero for large ξ, and the admissible solutions are of the forms e^{−nξ}(cos nη, sin nη).

For the scale of the transformation we have

\[ \frac{dz}{d\zeta} = c\sinh\zeta, \]

\[ \left|\frac{dz}{d\zeta}\right|^2 = c^2(\sinh^2\xi\cos^2\eta + \cosh^2\xi\sin^2\eta) = c^2(\cosh^2\xi - \cos^2\eta) = \tfrac12 c^2(\cosh 2\xi - \cos 2\eta). \]

Also

\[ |z^2| = c^2(\cosh^2\xi\cos^2\eta + \sinh^2\xi\sin^2\eta) = c^2(\cosh^2\xi - \sin^2\eta) = \tfrac12 c^2(\cosh 2\xi + \cos 2\eta). \]
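
The two closed forms for the scale and for |z|² can be checked numerically. The following is a minimal sketch, assuming the mapping z = c cosh ζ with ζ = ξ + iη used in this section; the function names are illustrative only and do not come from the text.

```python
import cmath
import math

def scale_squared(xi, eta, c=1.0):
    """|dz/dzeta|^2 evaluated directly from dz/dzeta = c sinh(zeta)."""
    return abs(c * cmath.sinh(complex(xi, eta))) ** 2

def scale_squared_closed(xi, eta, c=1.0):
    """Closed form (1/2) c^2 (cosh 2xi - cos 2eta)."""
    return 0.5 * c * c * (math.cosh(2 * xi) - math.cos(2 * eta))

def modulus_squared(xi, eta, c=1.0):
    """|z|^2 evaluated directly from z = c cosh(zeta)."""
    return abs(c * cmath.cosh(complex(xi, eta))) ** 2

def modulus_squared_closed(xi, eta, c=1.0):
    """Closed form (1/2) c^2 (cosh 2xi + cos 2eta)."""
    return 0.5 * c * c * (math.cosh(2 * xi) + math.cos(2 * eta))
```

The scale vanishes at ξ = 0, η = 0 or π, that is at the foci z = ±c, which is where the transformation ceases to be conformal.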
13·081. Conducting elliptic cylinder in a uniform field of force parallel to the major axis. The potential of the undisturbed field is

\[ -Fx = -Fc\cosh\xi\cos\eta + \text{constant}. \]

Suppose the conductor to be given by ξ = α. Then the extra term due to the presence of the conductor must be of the form Ae^{−ξ} cos η, for this tends to 0 as ξ → ∞, and it must cancel the variation of Fx over the surface ξ = α. Thus

\[ Ae^{-\alpha} - Fc\cosh\alpha = 0. \]

Hence

\[ \phi = \text{constant} - Fc\cosh\xi\cos\eta + Fce^{\alpha-\xi}\cosh\alpha\cos\eta = \text{constant} - Fce^{\alpha}\sinh(\xi-\alpha)\cos\eta. \]

The corresponding ψ is

\[ \psi = \text{constant} - Fce^{\alpha}\cosh(\xi-\alpha)\sin\eta. \]

The charge induced on the part of the cylinder on the positive side of the y axis is

\[ -\frac{1}{4\pi}\Big[\psi\Big]_{\eta=-\pi/2}^{\eta=\pi/2} = \frac{1}{2\pi}Fce^{\alpha} = \frac{1}{2\pi}F(a+b). \]

13·082. Elliptic cylinder of dielectric constant K in a uniform field parallel to the major axis. Let φ₁ be the potential inside the cylinder and φ₀ that outside. The only admissible forms are, apart from a constant,

\[ \phi_0 = -Fc\cosh\xi\cos\eta + Ae^{-\xi}\cos\eta, \qquad \phi_1 = B\cosh\xi\cos\eta. \]

The boundary conditions are φ₀ = φ₁, ∂φ₀/∂ξ = K ∂φ₁/∂ξ at ξ = α, since ∂ξ/∂n is continuous. Hence

\[ Ae^{-\alpha} - Fc\cosh\alpha = B\cosh\alpha, \qquad -Ae^{-\alpha} - Fc\sinh\alpha = KB\sinh\alpha. \]

Then

\[ A = Fc\,\frac{K-1}{\cosh\alpha + K\sinh\alpha}\,\cosh\alpha\sinh\alpha\,e^{\alpha}, \qquad B = -\frac{Fc\,e^{\alpha}}{\cosh\alpha + K\sinh\alpha}. \]
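
As a check on the algebra, the two boundary equations can be solved numerically and compared with the closed forms for A and B. This is only a sketch under the assumption that the boundary conditions are the two linear equations in A and B stated above; the function names are hypothetical.

```python
import math

def dielectric_coefficients(F, c, alpha, K):
    """Solve the two boundary equations at xi = alpha for (A, B) by Cramer's rule.

    ea*A -   ch*B = F*c*ch
   -ea*A - K*sh*B = F*c*sh
    """
    ea = math.exp(-alpha)
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    det = ea * (-K * sh) - (-ch) * (-ea)
    A = (F * c * ch * (-K * sh) - (-ch) * (F * c * sh)) / det
    B = (ea * F * c * sh - (-ea) * F * c * ch) / det
    return A, B

def closed_form(F, c, alpha, K):
    """Closed forms for A and B quoted in the text."""
    ch, sh = math.cosh(alpha), math.sinh(alpha)
    denom = ch + K * sh
    A = F * c * (K - 1) * ch * sh * math.exp(alpha) / denom
    B = -F * c * math.exp(alpha) / denom
    return A, B
```

With K = 1 the cylinder is indistinguishable from its surroundings, and the sketch returns A = 0 and B = −Fc, so that φ₁ reduces to the undisturbed potential −Fx.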


13·083. Rotating elliptic cylinder filled with fluid. Let ω be the angular velocity. On the boundary the velocity components of the cylinder are (−ωy, ωx). But if ψ is the stream function

\[ \frac{\partial\psi}{\partial y} = u, \qquad \frac{\partial\psi}{\partial x} = -v, \]

and therefore on the boundary

\[ \psi = \int (u\,dy - v\,dx) = -\tfrac12\omega|z|^2 = -\tfrac14\omega c^2(\cosh 2\alpha + \cos 2\eta). \]

The first term is an irrelevant constant, and we can take

\[ \psi = -\tfrac14\omega c^2\,\frac{\cosh 2\xi}{\cosh 2\alpha}\cos 2\eta, \qquad \phi = \tfrac14\omega c^2\,\frac{\sinh 2\xi}{\cosh 2\alpha}\sin 2\eta = \frac{\omega}{\cosh 2\alpha}\,xy. \]

Thus at points in the interior

\[ (u, v) = \frac{\omega}{\cosh 2\alpha}\,(y, x). \]

We have also, since

\[ \tanh\alpha = \frac{b}{a}, \qquad \cosh 2\alpha = \frac{a^2+b^2}{a^2-b^2}. \]
Thus, as we should expect, there is no motion inside if the cylinder is circular. In case
there is any doubt of the truth of the result, one may take a teacup with a floating leaf in 
it on the side next one’s mouth. On turning the cup round the leaf is found to be still on 
the side next the mouth. The result is of course true only for fluids of small viscosity. 
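
The result for the rotating cylinder can be tested by checking that no fluid crosses the wall. The sketch below assumes the interior velocity (u, v) = ω(y, x)/cosh 2α with cosh 2α = (a² + b²)/(a² − b²), and compares normal velocity components on the ellipse with semi-axes a, b; the helper names are illustrative.

```python
import math

def interior_velocity(x, y, omega, a, b):
    """Fluid velocity (u, v) = omega/cosh(2 alpha) * (y, x), with tanh(alpha) = b/a."""
    k = omega * (a * a - b * b) / (a * a + b * b)  # equals omega / cosh(2 alpha)
    return k * y, k * x

def normal_mismatch(eta, omega, a, b):
    """Normal velocity of the fluid minus that of the wall at (a cos eta, b sin eta).

    It should vanish if the fluid neither penetrates nor leaves the wall.
    """
    x, y = a * math.cos(eta), b * math.sin(eta)
    nx, ny = x / (a * a), y / (b * b)        # un-normalised outward normal
    u, v = interior_velocity(x, y, omega, a, b)
    wall_u, wall_v = -omega * y, omega * x   # rigid rotation of the wall
    return (u - wall_u) * nx + (v - wall_v) * ny
```

For a = b the factor (a² − b²)/(a² + b²) is zero and the sketch returns no interior motion, in agreement with the teacup experiment.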

13·09. Generalized Joukowsky transformations. Suppose that we are given the equation of a closed curve in the z' plane in the form f(x', y') = 0. Then we cannot at once give an answer to the question: what is the transformation that will transform the closed curve to a circle |z| = a and the region outside the curve to the outside of the circle? We have rather to examine various transformations and see what curve in the z' plane corresponds to |z| = a. It has, however, been found that many of the transformations important in mathematical physics belong to a very general class, and we first consider this generally before proceeding to special cases. Let


\[ z' = z\sum_{r=0}^{\infty}\frac{a_r}{z^r}, \qquad (1) \]

where the a_r may be complex and a₀ ≠ 0. In practice we usually take a₀ = 1. Suppose further that the series converges for |z| ≥ a and that dz'/dz has no zeros outside the circle C defined by |z| = a. Then the transformation is conformal for all z outside the circle. When z travels round the circle C, z' will describe a closed curve C'. If dz'/dz has a simple zero on C, the curve C' will have a cusp at the corresponding point. Further, a large circle in the z plane with centre at the origin corresponds to a large closed curve, approximately circular, in the z' plane and conversely. If we proceed inwards from a point on the large circle in the z plane along a straight line to the origin, the curve in the z' plane corresponding to the part of the line outside C will have a unique tangent at each point. z is therefore also a single-valued function of z' and will be expressible by a power series of the form

\[ z = z'\sum_{r=0}^{\infty}\frac{b_r}{z'^{\,r}} \]

if z' is sufficiently large.

With the further transformation z = ae^ζ we have that the part of the z' plane outside C' is represented on the semi-infinite strip ξ ≥ 0, 0 ≤ η < 2π of the ζ plane. Hence if a closed curve in the z' plane can be represented parametrically by

\[ x' + iy' = ae^{i\eta}\sum_{r=0}^{\infty}\frac{a_r}{a^r}e^{-ri\eta} \qquad (a \text{ real},\ a_r \text{ complex},\ a_0 \neq 0), \qquad (2) \]

we can at once infer that it is the curve corresponding to ξ = 0 of the family

\[ z' = ae^{\zeta}\sum_{r=0}^{\infty}\frac{a_r}{a^r}e^{-r\zeta} = z\sum_{r=0}^{\infty}\frac{a_r}{z^r}. \qquad (3) \]

Particular cases

13·091. Ellipse. Take

\[ z' = \frac12\,\frac{a+b}{a}\,z + \frac12\,\frac{a(a-b)}{z}. \qquad (4) \]

With z = ae^{iη} (the circle C) we have

\[ z' = x' + iy' = \tfrac12(a+b)e^{i\eta} + \tfrac12(a-b)e^{-i\eta} = a\cos\eta + ib\sin\eta, \qquad (5) \]
and the curve C' is an ellipse with semi-axes a, b, and η is the eccentric angle. This transformation is also often taken in the form

\[ z' = z + \frac{a^2}{z}, \qquad (6) \]

so as not to alter the scale at infinity; then to the circle |z| = a correspond the two sides of a line from (2a, 0) to (−2a, 0) and back. This can also be written

\[ \frac{z'+2a}{z'-2a} = \left(\frac{z+a}{z-a}\right)^2. \qquad (7) \]
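
A short numerical check of the ellipse mapping may be useful here. The sketch assumes the two forms z' = ½(a + b)z/a + ½a(a − b)/z and z' = z + a²/z; the function names are illustrative and not from the text.

```python
import cmath
import math

def ellipse_map(z, a, b):
    """Map taking the circle |z| = a to the ellipse with semi-axes a, b."""
    return 0.5 * (a + b) * z / a + 0.5 * a * (a - b) / z

def joukowsky(z, a):
    """Map z' = z + a^2/z, which leaves the scale at infinity unaltered."""
    return z + a * a / z

def on_circle(a, eta):
    """Point of the circle |z| = a with argument eta."""
    return a * cmath.exp(1j * eta)
```

For example, `ellipse_map(on_circle(a, eta), a, b)` should agree with a cos η + ib sin η, and for any z away from ±a the ratio (z' + 2a)/(z' − 2a) computed from `joukowsky` should equal ((z + a)/(z − a))².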


13·092. Joukowsky aerofoils. Take

\[ \frac{dz'}{dz} = \prod_{r=1}^{n}\left(1-\frac{z_r}{z}\right), \qquad \sum z_r = 0, \qquad |z_1| = a, \qquad |z_r| \le a. \qquad (8) \]

Then C' has a cusp at z'₁; for near it we have

\[ \frac{dz'}{dz} = (z-z_1)\,g(z), \qquad (9) \]

where g(z) is analytic and not zero at z = z₁, and therefore

\[ z' - z'_1 = \tfrac12\,g(z_1)(z-z_1)^2 + \dots. \qquad (10) \]

Hence as z travels along the circle and passes through z₁, z' approaches z'₁ and then recedes along a curve with the same tangent.

If n = 2, z₁ = a, z₂ = −a,

\[ \frac{dz'}{dz} = 1 - \frac{a^2}{z^2}, \]

and we recover (6). But if now instead of the circle C we take a slightly larger circle passing through +a but a little beyond −a and transform it, we shall get an Indian club-shaped figure with a cusp at z' = 2a and a rounded end at z' a little less than −2a. If further we take the centre of this circle a little off the axis of x the z' figure will not be symmetrical about the x' axis. Thus two terms are enough to give a fair representation of the form of an actual aeroplane wing. The most serious departure from actuality is that all Joukowsky aerofoils have a cusp at the trailing edge, whereas the actual angle is not a cusp. This can be remedied by a further modification due to Glauert, as follows.
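
The construction just described can be sketched numerically. The code assumes the two-term map z' = z + a²/z and a circle through z = +a whose centre is displaced by the hypothetical small amounts `eps` (to the left) and `kappa` (upwards); these parameter names and values are illustrative choices, not values from the text.

```python
import cmath
import math

def joukowsky(z, a):
    """Two-term map z' = z + a^2/z."""
    return z + a * a / z

def aerofoil(a=1.0, eps=0.08, kappa=0.05, n=200):
    """Image of a circle through z = +a with centre (-eps, kappa).

    Returns the centre, the radius and n boundary points of the aerofoil,
    the first of which is the image of z = +a (the cusp).
    """
    centre = complex(-eps, kappa)
    radius = abs(a - centre)             # the circle passes through z = +a
    start = cmath.phase(a - centre)      # direction of z = +a seen from the centre
    points = []
    for k in range(n):
        z = centre + radius * cmath.exp(1j * (start + 2 * math.pi * k / n))
        points.append(joukowsky(z, a))
    return centre, radius, points
```

The zero of dz'/dz at z = −a lies inside the displaced circle, so only the cusp at z' = 2a survives on the boundary; setting `kappa` to zero gives a figure symmetrical about the x' axis.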

13·093. Region bounded by circular arcs. We take now

\[ \frac{z' - (2-n)a\cos\beta}{z' + (2-n)a\cos\beta} = \left(\frac{z - ae^{-i\beta}}{z + ae^{i\beta}}\right)^{2-n}, \]

where n and β are small and positive. This is evidently the result of applying two transformations of the form 13·07 with different values of c, but now we are interested in the external region. We write

\[ \arg(z - ae^{-i\beta}) = \theta_1, \qquad \arg(z' - (2-n)a\cos\beta) = \theta'_1, \]
\[ \arg(z + ae^{i\beta}) = \theta_2, \qquad \arg(z' + (2-n)a\cos\beta) = \theta'_2, \]

where θ₁, θ₂, θ'₁, θ'₂ are defined to be zero when z is on BA produced, and vary continuously as z travels on a curve outside the circle.
Then

\[ \theta'_1 - \theta'_2 = (2-n)(\theta_1 - \theta_2) \]

for points z outside C. At P, θ₁ − θ₂ = ½π − β. If P transforms to P', P' is on a circular arc through ±(2 − n)a cos β, with angle (2 − n)(½π − β) at the circumference. But if P proceeds to near B and then travels round a small semicircle about B, θ₂ increases by π and therefore θ'₁ − θ'₂ becomes

\[ (2-n)(\tfrac12\pi - \beta) - (2-n)\pi = -(2-n)(\tfrac12\pi + \beta). \]

This is negative; we add 2π to give a positive angle, which will be the angle facing downwards as in 13·07. Then the lower arc gives an arc in the z' plane containing an angle

\[ 2\pi - (2-n)(\tfrac12\pi + \beta) = (2+n)\tfrac12\pi - (2-n)\beta. \]

If this is less than π the lower arc in the z' plane will be concave downwards. The z' figure consists of two circular arcs intersecting at an angle nπ.

If instead of C we take a circle through A but passing a little beyond B we get a rounded leading edge. This is Glauert's transformation. When z is large we find the first few terms of the series development to be

\[ z' = z + ia\sin\beta + \frac{(1-n)(3-n)}{3}\,\frac{a^2\cos^2\beta}{z} + \dots. \]


13·094. Closed polygonal boundary. Consider

\[ \frac{dz'}{dz} = A\prod_{r=1}^{n}\left(1-\frac{z_r}{z}\right)^{\alpha_r/\pi}, \qquad (1) \]

where A is constant, and |z_r| = a for all r. Take

\[ z = ae^{i\eta}, \qquad z_r = ae^{i\eta_r}. \qquad (2) \]

Then

\[ \frac{dz'}{d\eta} = Be^{i(1-\Sigma\alpha_r/2\pi)\eta}\prod_{r=1}^{n}\{\sin\tfrac12(\eta-\eta_r)\}^{\alpha_r/\pi}, \qquad (3) \]

where B is another constant. Therefore if Σα_r = 2π, arg(dz'/dη) is constant. Thus as z describes the arc between z_r and z_{r+1}, z' proceeds along a straight line. On passing half way round z_r, however, η − η_r changes sign and arg(dz'/dη) changes by ±α_r. The curve C' is therefore a polygon with external angles ±α_r.


We take first the external problem. Then z can travel positively about z_r, and arg(z − z_r) increases by π; thus the curves are oriented as shown.

The condition Σα_r = 2π is satisfied by the external angles of a closed polygon. If (1) is developed in powers of 1/z, it is seen to contain a term in Σα_r z_r/z, and therefore z' will not be single-valued unless we have also Σα_r z_r = 0. Subject therefore to this condition and Σα_r = 2π the outside of a circle C is transformed into the outside of a polygon.
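
The behaviour of arg(dz'/dη) can be illustrated for a square. The sketch assumes the external form dz'/dz = A Π(1 − z_r/z)^{α_r/π} with four equally spaced z_r and α_r = ½π, and uses the principal branch of the complex power, which is continuous on each arc because the real part of 1 − z_r/z is never negative on the circle. The names are illustrative.

```python
import cmath
import math

def dzprime_deta(eta, etas, alphas, a=1.0, A=1.0):
    """dz'/d(eta) on |z| = a for dz'/dz = A * prod (1 - z_r/z)^(alpha_r/pi)."""
    z = a * cmath.exp(1j * eta)
    value = A
    for eta_r, alpha_r in zip(etas, alphas):
        z_r = a * cmath.exp(1j * eta_r)
        value *= (1 - z_r / z) ** (alpha_r / math.pi)  # principal branch
    return value * 1j * z                               # dz/d(eta) = i z

# Hypothetical data for a square: vertices of the circle at odd multiples of pi/4.
SQUARE_ETAS = [math.pi / 4, 3 * math.pi / 4, 5 * math.pi / 4, 7 * math.pi / 4]
SQUARE_ALPHAS = [math.pi / 2] * 4

def closure_sums(etas, alphas, a=1.0):
    """Return (sum alpha_r, sum alpha_r z_r); these should be 2 pi and 0."""
    total = sum(alphas)
    moment = sum(al * a * cmath.exp(1j * et) for et, al in zip(etas, alphas))
    return total, moment
```

Between two consecutive vertices the argument of `dzprime_deta` stays fixed, and it changes by ½π on passing a vertex, as expected for the external angles of a square.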

The same transformation will not do for the inside of a polygon. For if we took the inside of the circle, then we should have to pass round z_r in the negative direction, and arg(dz'/dη) would decrease by α_r. Thus we should not get the inside of the same polygon, but the outside of its mirror image.

On the other hand, (1) determines z' as a function of z, which could be continued into the interior of C by paths between any pair z_r, z_{r+1}. But then we should have to make a cut along the whole of C except between these two in order to make z' single-valued; then this cut would have to be traversed internally and we get the same result as in the last paragraph.

13·095. But consider the transformation

\[ z' = \sum_{r=1}^{\infty} b_r z^r. \qquad (4) \]

(There is no loss of generality in dropping a constant b₀ on the right.) Suppose that the series converges for |z| ≤ a and that all zeros of dz'/dz lie outside or on this circle. As before, C corresponds to a closed curve C' in the z' plane, but now a small circle about z = 0 corresponds to a small curve approximating to a circle. Thus the interior of C is mapped on to the interior of C'. The transformation of this type for a polygon is known and is of considerable interest. Take

\[ \frac{dz'}{dz} = A\prod_{r=1}^{n}\left(1-\frac{z}{z_r}\right)^{-\alpha_r/\pi}, \qquad (5) \]

where A is constant and |z_r| = a. As before we get

\[ \frac{dz'}{d\eta} = iae^{i\eta}A\prod_{r=1}^{n} e^{-i\alpha_r(\eta-\eta_r)/2\pi}\{-2i\sin\tfrac12(\eta-\eta_r)\}^{-\alpha_r/\pi}, \qquad (6) \]

and arg(dz'/dη) is constant for each arc provided Σα_r = 2π. But now as z describes a semicircle about z_r on the inside arg(z − z_r) decreases by π and arg(dz'/dη) increases by α_r. Thus the α_r are again the external angles, but the interior of C corresponds to the interior of C'.

13·096. These transformations are due to various writers.*

The transformation of the outside of a circle into the outside of a general closed curve can be seen to be unique. For we can imagine the curve C' to be occupied by a conductor carrying a given charge per unit length. Then the external field is determinate and with a suitable adjustment of the charge we can arrange for the potential to behave like log r' and the charge function like θ' for large r'. Then

\[ \phi + i\psi = \log z' \qquad (r' \text{ large}). \]

Now if z' = f(z) transforms a circle C into the curve C', log z is a function of z' whose real part is constant over C' and therefore must be equal to φ provided that z' − z → 0 for |z| large. Hence

\[ \log z = \phi + i\psi, \qquad z = \exp(\phi + i\psi), \]

which is uniquely determined. In fact the external transformation is determined if the potential problem is solved, and conversely. The outstanding advantage of external transformations of this type is that they turn the outside of a closed curve into the outside of another, leaving the scale and orientation at infinity unaltered. Hence if the form of the complex potential is known at large distances it can be adapted to the transformed problem by simply writing z for z'. In the internal problem if φ is constant over a closed curve and ∇²φ = 0 in the interior, φ is constant in the interior and tells us nothing about z. A variable φ can be arranged by taking φ = log r' + φ' within C', where r' = |z' − c| and c is within C', ∇²φ' = 0 within C', and φ = 0 on C'. Then if z = exp(φ + iψ), z = z' − c for z' − c small, and |z| = 1 when z' is on C'. Hence the transformation represents C' and its interior on the unit circle and its interior, an arbitrary point c within C' corresponding to z = 0.

In both problems the existence of a solution of the potential problem is physically plausible; the analytical proof is difficult.

* W. G. Bickley, Phil. Trans. A, 228, 1929, 235-74; R. M. Morris, Proc. Camb. Phil. Soc. 33, 1937, 474-84; Math. Ann. 116, 1939, 374-400.


13·10. Another class of transformations, closely related but somewhat better known, is due to Schwarz and Christoffel. We take, keeping z' for the transformed figure,

\[ \frac{dz'}{dt} = A\prod (t-t_r)^{-\alpha_r/\pi}, \qquad (1) \]

where the points t_r lie along the real axis, which is taken as the path for t. The region taken is the upper half of the t plane. Then when t travels, necessarily in the negative sense, around t_r, arg(dz'/dt) increases by α_r. If Σα_r = 2π we get the interior of a closed polygon; if it is −2π we do not get the exterior, because when t describes a large semicircle arg z' will increase by 3π. If Σα_r = π two sides are parallel and extend to infinity.

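To see the relation to 13·095 we put

\[ z = ia\,\frac{t+ia}{t-ia}, \qquad \frac{dz}{dt} = \frac{2a^2}{(t-ia)^2}. \qquad (2) \]

We get

\[ \frac{dz'}{dt} = \frac{2a^2A}{(t-ia)^2}\prod\left\{\frac{2ia(t-t_r)}{(t-ia)(t_r+ia)}\right\}^{-\alpha_r/\pi} = \text{constant}\times\prod (t-t_r)^{-\alpha_r/\pi}, \qquad (3) \]

since Σα_r = 2π. This is of exactly the same form as (1); the only difference is that the t_r are on the real axis and the z_r on the circle.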

The external transformation 13·094 (1) behaves differently. We first put z = a²/ζ to transform the outside of the circle into the inside, and then take ζ instead of z in (2). We thus get

\[ z = -ia\,\frac{t-ia}{t+ia}, \qquad (4) \]

\[ \frac{dz'}{dt} = \frac{2a^2A}{(t+ia)^2}\prod\left\{\frac{2ia(t-t_r)}{(t-ia)(t_r+ia)}\right\}^{\alpha_r/\pi}. \qquad (5) \]
With Σα_r = 2π this has double poles at t = ±ia. Consequently there is a singularity in the upper half of the t plane and z' will not be single-valued as a function of t unless the residue vanishes. Thus we again require an extra condition as for the transformation into the outside of a circle. The presence of a singularity not on the path of integration probably makes this transformation less useful for external problems than the transformation into a circle.

To transform the inside or outside of a given polygon into the upper half of the t plane, we take the α_r equal to the external angles, but still have to find the t_r to make the lengths of the sides right. This is always possible, but the proof that it is possible is difficult, except for a triangle.

In the Schwarz-Christoffel transformation for a given polygon the points t_r for three corners can be chosen arbitrarily. For if we take, with αδ − βγ ≠ 0,

\[ t = \frac{\alpha s+\beta}{\gamma s+\delta}, \qquad \frac{dt}{ds} = \frac{\alpha\delta-\beta\gamma}{(\gamma s+\delta)^2}, \]

\[ t - t_r = \frac{(\alpha\delta-\beta\gamma)(s-s_r)}{(\gamma s+\delta)(\gamma s_r+\delta)}, \qquad \frac{dz'}{ds} = \frac{D}{(\gamma s+\delta)^2}\prod\left(\frac{s-s_r}{\gamma s+\delta}\right)^{-\alpha_r/\pi}, \]

and the factors γs + δ cancel, leaving the form unaltered. But α, β, γ, δ can be chosen to put three of the s_r anywhere we like. It is usually convenient to take them at some of the values 0, ±1, and ∞. Evidently the same will be true of the internal transformation into a circle. If the polygon has more than three vertices the choice of the values of s_r for three of them will fix those for the others.
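
The freedom to place three points can be made concrete. The sketch below builds, as one illustrative choice, the bilinear transformation sending three given distinct points to 0, 1 and ∞, and checks the identity for t − t_r used above; the function names are hypothetical.

```python
def mobius_to_0_1_inf(s1, s2, s3):
    """Coefficients (alpha, beta, gamma, delta) of t = (alpha s + beta)/(gamma s + delta)
    sending s1 -> 0, s2 -> 1 and s3 -> infinity (s1, s2, s3 distinct)."""
    alpha = s2 - s3
    beta = -s1 * (s2 - s3)
    gamma = s2 - s1
    delta = -s3 * (s2 - s1)
    return alpha, beta, gamma, delta

def apply_mobius(m, s):
    """Evaluate t = (alpha s + beta)/(gamma s + delta)."""
    alpha, beta, gamma, delta = m
    return (alpha * s + beta) / (gamma * s + delta)

def difference_identity(m, s, s_r):
    """Right-hand side of t - t_r = (alpha delta - beta gamma)(s - s_r)
    / ((gamma s + delta)(gamma s_r + delta))."""
    alpha, beta, gamma, delta = m
    det = alpha * delta - beta * gamma
    return det * (s - s_r) / ((gamma * s + delta) * (gamma * s_r + delta))
```

Once three points are fixed in this way the remaining ones are determined, which is why only three of the s_r are at our disposal.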

The external transformation is unique and no similar simplification is possible. The relation Σα_r z_r = 0 is equivalent to two relations between the η_r, and a factor of modulus 1 in all the z_r will be cancelled by another in the factor A.

A theory of the extension to curved boundaries is given by J. G. Leathem.*


EXAMPLES 

1. A straight slit of width 2a and of great length is cut in a large conducting sheet. Show that, when the sheet is charged, the field in the neighbourhood of the slit, not too near the ends, can be determined by a complex transformation of the form w = c(z² − a²)^{1/2}. Show that the surface density σ at a point distant x from the central line of the slit varies according to the law

\[ \sigma = \sigma_0(1 - a^2/x^2)^{-1/2}, \]

and that the equation of the equipotential surface of potential V is

(M.T. 1943.)

2. Show that the resistance between two circular electrodes of equal radius b in an infinite plane uniform sheet of material of two-dimensional conductivity σ is approximately


when the distance c between the electrodes is large compared with b.

The lines OA, OB of unlimited length form the boundary of a conducting sheet, which occupies the angle between them. At P on OA and Q on OB semicircular electrodes of radius b are let into the sheet, b being small compared with PQ. The angle AOB = α, and OP = OQ = a. Show that the resistance between the electrodes is approximately

\[ \frac{2}{\pi\sigma}\log\frac{2\alpha a}{\pi b}. \qquad \text{(M.T. 1941.)} \]

* Phil. Trans. A, 215, 1915, 439-87.

3. Electric charge is distributed with density e along the line x = y = 0, and the regions defined by |y| > a are occupied by conducting matter at zero potential. Verify that in the space between the conductors the potential φ is given by

\[ \phi + i\psi = -2e\log\tanh\frac{\pi z}{4a}, \qquad z = x + iy. \]

Prove that the charge induced upon unit length of the strip of surface defined by y = a, 0 < x < b is

\[ -\frac{e}{\pi}\tan^{-1}\tanh\frac{\pi b}{4a}. \]


4. Show that by means of the formulae

\[ \frac{\pi z}{2a} = \sqrt{\zeta^2-1} - \cosh^{-1}\zeta + \tfrac12\pi i, \qquad \frac{\pi w}{2a} = U\zeta, \]

(Prelim. 1937.)

the solution can be found for the problem of the flow on one side of a stepped boundary consisting of y = −a for x < 0, y = a for x > 0, and x = 0 for −a < y < a.

Show that if the pressure at infinity is zero the force on unit width of the transverse portion is zero.


5. Sketch the transformed curves of the axes in the z plane and of the trisectors of the angles 
between them, under the transformation 



(I.C. 1942.) 


6. By considering the transformation w = exp z² in the area A defined by −1 ≤ xy ≤ 1, x² − y² ≤ 1, x ≥ 0, show that

\[ \iint_A (x^2+y^2)\,e^{2(x^2-y^2)}\,dx\,dy = \tfrac12 e^2. \qquad \text{(I.C. 1936.)} \]

7. Find a system of curves orthogonal to the curves of the family x³ − 3xy² = c.





Chapter 14 

FOURIER’S THEOREM 

‘I must go in and out.’ 

Bernard Shaw, Heartbreak House

14·01. Harmonics fitted to n equally spaced values. Let the values of f(x) be specified for n equally spaced values of x, namely,

\[ x_r = \frac{2\pi r}{n} = r\lambda \qquad (r = 0, 1, \dots, n-1). \qquad (1) \]

Denote f(x_r) briefly by f_r; we wish to determine coefficients C_s so that

\[ f_r = \sum_{s=0}^{n-1} C_s e^{irs\lambda}. \qquad (2) \]

These are n equations in n unknowns. Multiply the rth equation by exp(−irmλ), where m is one of 0, 1, ..., n − 1, and add for all values of r. We get

\[ \sum_{r=0}^{n-1} f_r e^{-irm\lambda} = \sum_{r=0}^{n-1}\sum_{s=0}^{n-1} C_s e^{ir(s-m)\lambda}. \qquad (3) \]

Now if s ≠ m,

\[ \sum_{r=0}^{n-1} e^{ir(s-m)\lambda} = \frac{1-e^{in(s-m)\lambda}}{1-e^{i(s-m)\lambda}} = 0, \qquad (4) \]

since nλ = 2π; if s = m each term is 1 and the sum is n. Hence

\[ \sum_{r=0}^{n-1} f_r e^{-irm\lambda} = nC_m. \qquad (5) \]

To show that these satisfy the original equations, we have, replacing m by s,

\[ \sum_{s=0}^{n-1} C_s e^{its\lambda} = \frac1n\sum_{s=0}^{n-1}\sum_{r=0}^{n-1} f_r e^{i(t-r)s\lambda}, \qquad (6) \]

and the sum of e^{i(t−r)sλ} with regard to s is 0 unless r = t, when it is n. Hence the sum (6) is f_t and (5) is the solution.
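
The solution (5) and the check (6) translate directly into a few lines of code. This is a minimal sketch using only the standard library; the function names are illustrative, and the direct double sum is used for clarity rather than speed.

```python
import cmath
import math

def harmonic_coefficients(f):
    """C_m = (1/n) * sum_r f_r exp(-i r m lambda), with lambda = 2 pi / n."""
    n = len(f)
    lam = 2 * math.pi / n
    return [sum(f[r] * cmath.exp(-1j * r * m * lam) for r in range(n)) / n
            for m in range(n)]

def reconstruct(C, t):
    """f_t = sum_s C_s exp(i t s lambda), the original equations (2)."""
    n = len(C)
    lam = 2 * math.pi / n
    return sum(C[s] * cmath.exp(1j * t * s * lam) for s in range(n))
```

For any list of n values, `reconstruct(harmonic_coefficients(f), t)` should return f_t for t = 0, 1, ..., n − 1, apart from rounding errors.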

For s = 0, C_s is simply (1/n) Σ f_r. If m is a value of s and not zero, n − m is another. Hence apart from s = 0 we can take the terms of (2) in pairs; and

\[ C_s e^{its\lambda} + C_{n-s}e^{it(n-s)\lambda} = \frac1n\sum_{r=0}^{n-1} f_r\{e^{i(t-r)s\lambda} + e^{i(t-r)(n-s)\lambda}\} = \frac2n\sum_{r=0}^{n-1} f_r\cos(t-r)s\lambda. \qquad (7) \]

If n is even, s = ½n is one possible value and occurs only once; for this

\[ C_s e^{its\lambda} = \frac1n\sum_{r=0}^{n-1} f_r e^{\frac12 ni(t-r)\lambda} = \frac1n\sum_{r=0}^{n-1} f_r\cos\tfrac12 n(t-r)\lambda, \qquad (8) \]

since

\[ \sin\tfrac12 n(t-r)\lambda = \sin(t-r)\pi = 0. \]
Thus

\[ f_t = A_0 + \sum A_s\cos st\lambda + \sum B_s\sin st\lambda, \qquad (9) \]

where the summations include s = 1, 2, ... up to ½n or ½(n − 1) according as n is even or odd; and

\[ A_0 = \frac1n\sum_{r=0}^{n-1} f_r, \qquad (10) \]

\[ A_s = \frac2n\sum_{r=0}^{n-1} f_r\cos sr\lambda, \qquad B_s = \frac2n\sum_{r=0}^{n-1} f_r\sin sr\lambda \qquad (0 < s < \tfrac12 n), \qquad (11) \]

and if n is even

\[ A_{n/2} = \frac1n\sum_{r=0}^{n-1} f_r\cos\tfrac12 nr\lambda = \frac1n\sum_{r=0}^{n-1}(-1)^r f_r, \qquad (12) \]

the sine term vanishing since sin rπ = 0.

An alternative method is to assume the form (9) directly and evaluate Σf_r, Σf_r cos rsλ, Σf_r sin rsλ. The summations are, however, slightly more difficult by this method.
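
The real form (9) with the coefficients (10) to (12) can be sketched as follows. The dictionaries keyed by s are an implementation choice made here for illustration, and the function names are hypothetical.

```python
import math

def real_harmonics(f):
    """Return dictionaries A, B of the coefficients in (10)-(12) for the values f."""
    n = len(f)
    lam = 2 * math.pi / n
    A = {0: sum(f) / n}
    B = {}
    for s in range(1, (n - 1) // 2 + 1):          # all s with 0 < s < n/2
        A[s] = 2.0 / n * sum(f[r] * math.cos(s * r * lam) for r in range(n))
        B[s] = 2.0 / n * sum(f[r] * math.sin(s * r * lam) for r in range(n))
    if n % 2 == 0:                                 # extra cosine term for even n
        A[n // 2] = sum((-1) ** r * f[r] for r in range(n)) / n
    return A, B

def evaluate(A, B, t, n):
    """Evaluate (9) at t; fractional t gives the interpolation function."""
    lam = 2 * math.pi / n
    return (sum(A[s] * math.cos(s * t * lam) for s in A)
            + sum(B[s] * math.sin(s * t * lam) for s in B))
```

With 24 hourly values the sketch yields cosine and sine terms for periods 24, 12, 8, ... hours and a cosine term only for the 2-hour period, since there is no entry B[12].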

In this way we represent the n values exactly as the sums of a constant and n— 1 
trigonometrical terms. The method is known as harmonic analysis and is extensively used 
in the study of observational data. In meteorology, for instance, the pressure, temperature, humidity and so on are recorded at intervals of an hour. From the hourly values a harmonic representation can be found for each day, including terms of periods 24, 12, 8, 6, 24/5, ... hours down to 2 hours. But for the last the data will determine only a cosine term, since the corresponding sine term vanishes at all the times where there are observations; to find it we should need a shorter interval between consecutive observations.
If we extend (9) to fractional values of t we can regard it as an interpolation function. 
Analyses can be carried out for all days over an interval and the results compared to see 
whether the harmonic terms repeat themselves and can therefore be made the basis of 
inferences over longer intervals; for instance, the 24-hourly period in temperature is 
obvious, but its amplitude and phase are found by harmonic analysis, while diurnal and 
semidiurnal periods in the pressure are noteworthy features of the climate in many 
regions. 

Useful two-figure tables for harmonic analysis have been published by H. H. Turner* 
for 9 to 21 intervals. His r is the present r +1. 

14·02. Fourier series. Now suppose that f(x) is given for all values of x from 0 to 2π. We can increase n indefinitely and thereby make our interpolation function agree with f(x) at more and more points, and the interval is 2π/n, while rλ = x. Then if the coefficients tend to definite limits these will be

\[ A_0 = \frac{1}{2\pi}\int_0^{2\pi} f(x)\,dx, \qquad A_s = \frac1\pi\int_0^{2\pi} f(x)\cos sx\,dx, \qquad B_s = \frac1\pi\int_0^{2\pi} f(x)\sin sx\,dx, \qquad (1) \]

provided that the integrals exist, since the method of subdivision used is only one of the ways that must give the same limit if the integrals exist. It is sufficient that f(x) itself shall be integrable; this will imply the existence of all the other integrals.

* Tables for facilitating the use of Harmonic Analysis, 1913.

Let g_n(x) be the interpolation function obtained as in 14·01 for n intervals. We should expect that, when n increases indefinitely, the interpolation function tends to a limit f(x) for every value of x. Unfortunately this is difficult to prove directly, and is not even always true. Clearly if f(x) is one of the peculiar functions that interest pure mathematicians, such as one that is zero for all rational fractions of 2π and 1 for all irrational ones, then since we sample it only at points of the former set the coefficients in g_n(x) will always be 0 and the limiting series will vanish for all x; it will therefore disagree with f(x) at every irrational fraction of 2π. The integrals (1) then do not exist in the Riemann sense in this case, but they do exist in the Lebesgue sense, and then the series represents the function at irrational values of x/2π and not at rational ones. Even if f(x) is continuous we cannot infer that g_n(x) → f(x) unless we can also show that the limit exists, and this is not always true.*

It is easiest to proceed to direct study of the series of sines and cosines

\[ A_0 + \sum_{n=1}^{\infty}(A_n\cos nx + B_n\sin nx), \]

where A_n and B_n are defined by (1). This series is called the Fourier series of f(x). We recall that if f(x) is of bounded variation in an interval (a, b), it need not be continuous, but f(x−) and f(x+) exist for every x within the interval, and f(a+) and f(b−) also exist. Then we shall show

(1) If f(x) is of bounded variation in (0, 2π), the Fourier series of f(x) converges to

\[ \tfrac12\{f(x-) + f(x+)\} \]

for every interior point of the interval, and at the end points to ½{f(0+) + f(2π−)}.

(2) A_n, B_n tend to zero as n increases, at least as fast as 1/n.

(3) If f(x) is also continuous at all points of the interval, the sum of the Fourier series is equal to f(x) at all points of the interval.

(4) If f(x) is differentiable at all points of the interval and f'(x) has bounded variation, the coefficients decrease at least as fast as n⁻², and similarly for higher derivatives.
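
Statements (1) and (2) can be illustrated, though of course not proved, by a simple step function. The example below is an assumption made for illustration: f = 1 on (0, π) and f = 0 on (π, 2π), for which the integrals give A₀ = ½, A_n = 0 and B_n = (1 − cos nπ)/(nπ).

```python
import math

def step_coefficient(n):
    """B_n = (1 - cos n pi)/(n pi) for the step f = 1 on (0, pi), 0 on (pi, 2 pi)."""
    return (1.0 - math.cos(n * math.pi)) / (n * math.pi)

def step_partial_sum(x, N):
    """Sum of the Fourier series of the step up to the term in Nx."""
    total = 0.5                                   # A_0
    for n in range(1, N + 1):
        total += step_coefficient(n) * math.sin(n * x)
    return total
```

At the jump x = π, and at the end point x = 0, every partial sum equals ½, the mean of the limits on the two sides, while at interior points of continuity the partial sums approach f(x) slowly, in keeping with coefficients that decrease only as 1/n.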

When several derivatives have bounded variation the rapid decrease of the early terms makes the series useful for computation. We have had an example in the case of the Bernoulli polynomials, for which the expansions given in 12·07 are the Fourier expansions. For P₄(t) the fourth term of the expansion has an amplitude 1/256 of that of the first, so that a few terms of the series give a good idea of the general appearance of the function and can even be used for computation if three-figure accuracy is sufficient. The Fourier series also has important applications in potential theory; these are shared by what is called the allied series† Σ_{n=1}^∞ (A_n sin nx − B_n cos nx). Somewhat more severe conditions are needed for the convergence of the allied series.

To prove the above statements we need a lemma.


14·03. Riemann's lemma. If φ(x) is non-decreasing and bounded in the range a to b, and λ is large,

\[ \int_a^b \phi(x)\cos\lambda x\,dx \quad\text{and}\quad \int_a^b \phi(x)\sin\lambda x\,dx \quad\text{are}\quad O\!\left(\frac1\lambda\right). \]

For

\[ \int_a^b \phi(x)\cos\lambda x\,dx = \left[\phi(x)\,\frac{\sin\lambda x}{\lambda}\right]_a^b - \frac1\lambda\int_a^b \sin\lambda x\,d\phi(x), \]

the integral on the right being a Stieltjes integral if φ(x) changes discontinuously at any value of x. All elements δφ(x) are ≥ 0, and |sin λx| ≤ 1; hence if |φ(x)| ≤ A

\[ \left|\int_a^b \phi(x)\cos\lambda x\,dx\right| \le \frac{2A}{\lambda} + \frac{2A}{\lambda} = \frac{4A}{\lambda}. \]

Also, similarly,

\[ \left|\int_a^b \phi(x)\sin\lambda x\,dx\right| \le \frac{4A}{\lambda}. \]

If we change φ(x) to −φ(x) the result is seen to be true also if φ(x) is non-increasing in the interval.

We know that if f(x) is of bounded variation in the range a to b, it can be expressed as the sum of a non-increasing and a non-decreasing function, both bounded in the interval. Hence for any function of bounded variation

\[ \int_a^b f(x)\cos\lambda x\,dx = O\!\left(\frac1\lambda\right), \qquad \int_a^b f(x)\sin\lambda x\,dx = O\!\left(\frac1\lambda\right). \]

* If f(x) satisfies a Lipschitz condition the statements suggested can be proved: cf. D. Jackson, The Theory of Approximation, 1930, 130; A. C. Offord, Duke Math. J. 6, 1940, 505-10.
† We use here the name introduced by W. H. Young. The term conjugate series is also used.
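
The bound 4A/λ of the lemma can be observed numerically. The sketch takes, purely as an illustrative assumption, the non-decreasing function φ(x) = x on (0, ½) and x + 1 on (½, 1), so that |φ| ≤ A = 2, and estimates the integrals by the midpoint rule; the function names are hypothetical.

```python
import math

def phi(x):
    """A bounded non-decreasing function on [0, 1] with a jump at x = 1/2."""
    return x + (1.0 if x >= 0.5 else 0.0)

def oscillatory_integral(lam, kind="cos", steps=20000):
    """Midpoint-rule estimate of the integral of phi(x) cos(lam x), or sin, over [0, 1]."""
    trig = math.cos if kind == "cos" else math.sin
    h = 1.0 / steps
    return h * sum(phi((k + 0.5) * h) * trig(lam * (k + 0.5) * h)
                   for k in range(steps))
```

Multiplying the estimate by λ gives a quantity that stays below 4A = 8 for every λ tried, although the jump in φ prevents any faster decrease than 1/λ.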


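14·04. Summation of Fourier and allied series. We can write

\[ A_n\cos nx + B_n\sin nx = \frac1\pi\int_0^{2\pi} f(t)\cos n(t-x)\,dt, \qquad (1) \]

\[ A_n\sin nx - B_n\cos nx = -\frac1\pi\int_0^{2\pi} f(t)\sin n(t-x)\,dt. \qquad (2) \]

We denote the sums of the Fourier series and the allied series, up to the terms in nx, by S_n(x) and T_n(x), and take f(x) to be of bounded variation in (0, 2π). Then

\[ S_n(x) + iT_n(x) = \frac1\pi\int_0^{2\pi} f(t)\Big\{\tfrac12 + \sum_{m=1}^{n} e^{-im(t-x)}\Big\}\,dt \]

\[ = \frac1\pi\int_0^{2\pi} f(t)\left(-\tfrac12 + \frac{1-e^{-(n+1)i(t-x)}}{1-e^{-i(t-x)}}\right)dt \]

\[ = \frac{1}{2\pi}\int_0^{2\pi} f(t)\left\{\frac{\sin(n+\frac12)(t-x)}{\sin\frac12(t-x)} - i\,\frac{\cos\frac12(t-x) - \cos(n+\frac12)(t-x)}{\sin\frac12(t-x)}\right\}dt. \qquad (3) \]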
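
The summation that leads to (3) rests on a closed form for ½ + Σ e^{−imθ}, with θ = t − x. A short numerical comparison, given here as a sketch with illustrative function names, guards against slips of sign.

```python
import cmath
import math

def kernel_sum(theta, n):
    """1/2 + sum_{m=1}^{n} exp(-i m theta), summed term by term."""
    return 0.5 + sum(cmath.exp(-1j * m * theta) for m in range(1, n + 1))

def kernel_closed(theta, n):
    """Closed form whose real and imaginary parts give the kernels of S_n and T_n."""
    s = math.sin(0.5 * theta)
    real = math.sin((n + 0.5) * theta) / (2 * s)
    imag = -(math.cos(0.5 * theta) - math.cos((n + 0.5) * theta)) / (2 * s)
    return complex(real, imag)
```

Both kernels remain finite as θ → 0: the real part tends to n + ½ and the imaginary part to 0, which is the statement that the integrands are finite even at t = x.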


This reduces S_n(x) and T_n(x) separately to single integrals. We have to study their behaviour as n tends to infinity. We notice first that the integrands are finite at all points, even at t = x. For S_n(x), first exclude an arbitrarily small range x − δ < t < x + δ. Then in the remaining ranges f(t) cosec ½(t − x) is bounded, and has bounded variation if f(t) has. Therefore by Riemann's lemma the contribution to S_n(x) from these ranges tends to 0 with increasing n. Also in x − δ < t < x + δ,

\[ f(t)\left\{\frac{1}{\frac12(t-x)} - \frac{1}{\sin\frac12(t-x)}\right\} \]

has bounded variation. Hence

\[ S_n(x) - \frac1\pi\int_{x-\delta}^{x+\delta} f(t)\,\frac{\sin(n+\frac12)(t-x)}{t-x}\,dt \to 0. \qquad (4) \]

Since δ is arbitrarily small it follows that if the series converges its sum depends wholly on the values of f(t) near t = x. Also, on putting

\[ (n+\tfrac12)(t-x) = u, \qquad (5) \]

\[ S_n(x) - \frac1\pi\int_{-(n+1/2)\delta}^{(n+1/2)\delta} f\!\left(x + \frac{u}{n+\frac12}\right)\frac{\sin u}{u}\,du \to 0. \qquad (6) \]
As n → ∞ the limits tend to ±∞. But as t → x from larger or smaller values, f(t) → f(x+) or f(x−). By our choice of δ we can make the variation of the values of f(t) on each side of x as small as we like, by 1·093. Hence, by du Bois-Reymond's form of the second mean value theorem (1·134), putting

we can make the integral as near as we like to

\[ \int_0^{\infty} f(x+)\,\frac{\sin u}{u}\,du + \int_0^{\infty} f(x-)\,\frac{\sin u}{u}\,du = \tfrac12\pi\{f(x+) + f(x-)\}, \qquad (7) \]

and therefore

\[ S_n(x) \to \tfrac12\{f(x+) + f(x-)\}. \qquad (8) \]

The series is therefore convergent at any point x such that f(t) tends to definite limits as t approaches x on each side. If these limits are the same and equal to f(x), the sum is f(x). If they are different, as when f(t) has a finite jump at x, the sum is the mean of the limits. If x = 0 or 2π, (4) must be modified; making the appropriate changes we find

\[ S_n(0) = S_n(2\pi) \to \tfrac12\{f(0+) + f(2\pi-)\}. \]

Now consider

\[ T_n(x) = -\frac{1}{2\pi}\int_0^{2\pi} f(t)\,\frac{\cos\frac12(t-x) - \cos(n+\frac12)(t-x)}{\sin\frac12(t-x)}\,dt. \qquad (9) \]


This is convergent, but the convergence is only saved by the term in cos(n + ½)(t − x), which we wish to treat as a remainder term. We can, however, proceed as follows. First exclude a range of length 2δ about x as before; outside this the part depending on cos(n + ½)(t − x) tends to 0. Then T_n(x) has the same limit as

\[ -\frac{1}{2\pi}\left[\int_0^{x-\delta} + \int_{x+\delta}^{2\pi}\right] f(t)\cot\tfrac12(t-x)\,dt - \frac{1}{2\pi}\int_{x-\delta}^{x+\delta} f(t)\,\frac{\cos\frac12(t-x) - \cos(n+\frac12)(t-x)}{\sin\frac12(t-x)}\,dt, \qquad (10) \]

and the last portion

\[ = -\frac{1}{2\pi}\int_0^{\delta}\{f(x+v) - f(x-v)\}\,\frac{\cos\frac12 v - \cos(n+\frac12)v}{\sin\frac12 v}\,dv. \qquad (11) \]

Now if {f(x + v) − f(x − v)} cosec ½v is bounded for 0 ≤ v ≤ δ we can again apply Riemann's lemma and say that the term in cos(n + ½)v tends to 0. But then, if the upper bound of this expression is M,

\[ \left|\int_0^{\delta}\{f(x+v) - f(x-v)\}\cot\tfrac12 v\,dv\right| \le M\int_0^{\delta}\cos\tfrac12 v\,dv = 2M\sin\tfrac12\delta, \qquad (12) \]

which we could make arbitrarily small at the outset by a suitable choice of δ. The condition assumed is true if f(t) is differentiable on each side of x; the inference would remain valid if the derivative was itself discontinuous at x. Then we have the result that if f(t) is differentiable on each side of x, the allied series is convergent and its sum is

\[ -\frac{1}{2\pi}\lim_{\delta\to0}\left[\int_0^{x-\delta} + \int_{x+\delta}^{2\pi}\right] f(t)\cot\tfrac12(t-x)\,dt = -\frac{1}{2\pi}\,P\!\int_0^{2\pi} f(t)\cot\tfrac12(t-x)\,dt, \qquad (13) \]

P denoting the principal value.
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14*041. The Lipschitz condition. If for some positive M and a, and [ v | ^ 

| f( x -j_ —f(x) | ^ M | v |*, it will still be true that (12) can be made arbitrarily small by 

a suitable choice of 8. With this condition the truth of (8) follows immediately from (6) 

poo 

and the fact that sin uduju converges. 

Jo ... . 

If a function satisfies this condition with $\alpha = 1$, and some $M$ independent of $x$, it
also follows immediately that the function is of bounded variation; for the total variation
is the upper bound of

$$|f(x_1) - f(0)| + |f(x_2) - f(x_1)| + \dots + |f(2\pi) - f(x_n)| \le M\sum |x_r - x_{r-1}| = 2\pi M.$$

Hence a sufficient condition for the Fourier series to converge to $f(x)$ and the allied series
to (13) in $(0, 2\pi)$ is that $f(x)$ shall satisfy a Lipschitz condition of order 1 uniformly; this
takes in its stride the condition that the variation shall be bounded.

14.05. Complex theory. We replace $x$ by $\theta$ and $t$ by $x$, and regard $\theta$ as the argument
of a complex variable $z$, of modulus $a$. We shall show that in suitable conditions the Fourier
series has a simple relation to the solution of a potential problem, where the potential $\phi$
is given over the circle $|z| = a$ and is required in the interior. We have seen from Laurent's
theorem that the supposition that $\phi + i\psi$ is analytic in a zone containing the circle leads
to the Fourier expansion. We shall now see that the condition is not necessary.

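The Fourier and allied series of $f(\theta)$ can be written together

$$S + iT = \frac{1}{2\pi}\int_0^{2\pi} f(x)\,dx + \frac{1}{\pi}\sum_{n=1}^{\infty}\int_0^{2\pi} f(x)\,e^{-ni(x-\theta)}\,dx. \tag{1}$$

Put

$$t = ae^{ix}, \qquad z = re^{i\theta} \quad (r < a), \tag{2}$$

and consider the series

$$\phi + i\psi = \frac{1}{2\pi}\int_0^{2\pi} f(x)\,dx + \frac{1}{\pi}\sum_{n=1}^{\infty}\int_0^{2\pi}\left(\frac{z}{t}\right)^{n} f(x)\,dx. \tag{3}$$

When $r \to a$ the terms reduce to those of (1). Since the series is uniformly convergent in
any closed interval of $r < a$,

$$\phi + i\psi = \frac{1}{2\pi}\int_0^{2\pi} f(x)\,\frac{t+z}{t-z}\,dx = \frac{1}{2\pi i}\oint f(x)\,\frac{t+z}{t-z}\,\frac{dt}{t}, \tag{4}$$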


where the integral in the last expression is taken around the circle $|t| = a$. This function
is defined subject simply to $f(x)$ being integrable, and provides us with a precise starting
point. We wish to study its properties, and in particular to see how it behaves when $|z| \to a$.
Put $x - \theta = \vartheta$ and separate the real and imaginary parts. We get

$$\phi = \frac{1}{2\pi}\int_0^{2\pi}\frac{a^2 - r^2}{a^2 - 2ar\cos\vartheta + r^2}\,f(\theta+\vartheta)\,d\vartheta, \tag{5}$$

$$\psi = -\frac{1}{\pi}\int_0^{2\pi}\frac{ar\sin\vartheta}{a^2 - 2ar\cos\vartheta + r^2}\,f(\theta+\vartheta)\,d\vartheta. \tag{6}$$


As they are the real and imaginary parts of a function of the complex variable $z$, these
functions are solutions of Laplace's equation in two dimensions everywhere within
the circle, and we shall expect $\phi$ to be the potential and $\psi$ the charge function, given that
$\phi$ takes the values $f(\theta)$ on the boundary and that there is no charge inside. We therefore
proceed to examine the behaviour of $\phi$ and $\psi$ when $r$ tends to $a$. Suppose that when $x$ is
near $\theta$, $f(x)$ tends to limits $f(\theta-)$ and $f(\theta+)$ as $x$ tends to $\theta$ through smaller and larger
values respectively. Then for any $\omega$ we can choose a quantity $\delta$ so that

$$|f(\theta+\vartheta) - f(\theta+)| < \omega \quad (0 < \vartheta \le \delta), \qquad
  |f(\theta+\vartheta) - f(\theta-)| < \omega \quad (-\delta \le \vartheta < 0). \tag{7}$$


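Then

$$\phi = \frac{1}{2\pi}\left[\int_{\delta}^{2\pi-\delta} + \int_{-\delta}^{0} + \int_{0}^{\delta}\right]\frac{a^2 - r^2}{a^2 - 2ar\cos\vartheta + r^2}\,f(\theta+\vartheta)\,d\vartheta. \tag{8}$$

When $r \to a$ the first integral tends to 0. For the third,

$$\frac{f(\theta+) - \omega}{2\pi}\int_0^{\delta}\frac{(a^2 - r^2)\,d\vartheta}{a^2 - 2ar\cos\vartheta + r^2}
 < \frac{1}{2\pi}\int_0^{\delta}\frac{a^2 - r^2}{a^2 - 2ar\cos\vartheta + r^2}\,f(\theta+\vartheta)\,d\vartheta
 < \frac{f(\theta+) + \omega}{2\pi}\int_0^{\delta}\frac{(a^2 - r^2)\,d\vartheta}{a^2 - 2ar\cos\vartheta + r^2}. \tag{9}$$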


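Now, if $\tan\tfrac12\vartheta = u$,

$$\int_0^{\delta}\frac{(a^2 - r^2)\,d\vartheta}{a^2 - 2ar\cos\vartheta + r^2}
 = \int_0^{\tan\frac12\delta}\frac{(a^2 - r^2)\,2\,du}{(a-r)^2 + (a+r)^2u^2}
 = 2\tan^{-1}\left\{\frac{a+r}{a-r}\tan\tfrac12\delta\right\}, \tag{10}$$

which tends to $\pi$ as $r \to a$. Hence as $r \to a$ we can make the second integral in (9) lie between
$\tfrac12\{f(\theta+) \pm \omega\}$; and by the type of argument already familiar in potential theory $\phi$ has
a limit as $r \to a$ equal to

$$\lim \phi = \tfrac12\{f(\theta-) + f(\theta+)\}. \tag{11}$$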
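
The limit just obtained can be checked numerically. The sketch below is an illustration only, not part of the original argument: it evaluates the integral (5) by the midpoint rule for an assumed boundary function (a square wave equal to +1 on $(0, \pi)$ and -1 on $(\pi, 2\pi)$) with $a = 1$. The names `square_wave` and `poisson_phi` and the choice of 4000 cells are hypothetical.

```python
import math

def square_wave(x):
    """Assumed boundary values: +1 on (0, pi), -1 on (pi, 2*pi), periodic."""
    return 1.0 if (x % (2.0 * math.pi)) < math.pi else -1.0

def poisson_phi(f, r, theta, a=1.0, cells=4000):
    """Midpoint-rule estimate of integral (5) for phi at z = r*exp(i*theta)."""
    h = 2.0 * math.pi / cells
    total = 0.0
    for k in range(cells):
        v = (k + 0.5) * h  # midpoint of the k-th cell in the angle variable
        kernel = (a * a - r * r) / (a * a - 2.0 * a * r * math.cos(v) + r * r)
        total += kernel * f(theta + v)
    return total * h / (2.0 * math.pi)

# At the jump theta = 0 the two sides cancel, giving the mean of +1 and -1.
print(poisson_phi(square_wave, 0.9, 0.0))
# Away from the jump the value moves towards f(theta) = 1 as r approaches a.
print(poisson_phi(square_wave, 0.9, 0.5 * math.pi))
```

For this particular boundary function the interior value has the closed form $(2/\pi)\tan^{-1}\{2r\sin\theta/(1-r^2)\}$, which the second call should reproduce closely.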


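We break up the range for $\psi$ similarly. The range $\delta$ to $2\pi - \delta$ gives

$$-\frac{1}{2\pi}\int_{\delta}^{2\pi-\delta} f(\theta+\vartheta)\cot\tfrac12\vartheta\,d\vartheta, \tag{12}$$

and the range $-\delta$ to $\delta$ gives

$$-\frac{1}{\pi}\int_0^{\delta}\{f(\theta+\vartheta) - f(\theta-\vartheta)\}\frac{ar\sin\vartheta}{a^2 - 2ar\cos\vartheta + r^2}\,d\vartheta
 = -\frac{4}{\pi}\int_0^{\tan\frac12\delta}\{f(\theta+\vartheta) - f(\theta-\vartheta)\}\frac{ar\,u\,du}{(1+u^2)\{(a-r)^2 + (a+r)^2u^2\}}$$

$$= \frac{1}{\pi}\int_0^{\tan\frac12\delta}\{f(\theta+\vartheta) - f(\theta-\vartheta)\}\left\{\frac{u}{1+u^2} - \frac{(a+r)^2u}{(a-r)^2 + (a+r)^2u^2}\right\}du. \tag{13}$$

The first part on the right of (13) is independent of $r$. The second is less numerically than

$$\frac{1}{\pi}\int_0^{\tan\frac12\delta}\bigl|f(\theta+\vartheta) - f(\theta-\vartheta)\bigr|\,\frac{du}{u}, \tag{14}$$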


which tends to 0 with $\delta$ if $f(\theta+\vartheta)$ is differentiable at $\vartheta = 0$. (Actually a Lipschitz condition
would do; it would be enough that $\{f(\theta+\vartheta) - f(\theta-\vartheta)\}u^{-\alpha}$ should be bounded, where $\alpha$
is any positive number.) Then both parts tend to 0 with $\delta$, and

$$\lim \psi = -\frac{1}{2\pi}\,P\!\int_0^{2\pi} f(\theta+\vartheta)\cot\tfrac12\vartheta\,d\vartheta, \tag{15}$$

which was found for the sum of the allied series on the circle in 14.04. If $f(\theta+\vartheta) - f(\theta-\vartheta)$
behaves like $1/(-\log|u|)$ the integral (14) will not tend to 0 and the principal value will
not exist, but such cases do not seem physically important.

Thus the procedure of introducing the factors $(r/a)^n$ and letting $r$ tend to $a$ gives the
same values for the series as direct summation does. By Abel's theorem this would be
expected, since we know that if a power series converges at a point on the circle of
convergence its sum there is the limit of the sum on approaching that point from within.
But the present results are true under somewhat wider conditions, for the argument
merely assumes that $f(\theta)$ is integrable, not that it has bounded variation. It may
therefore give a meaning to the series on the circle even though the latter may not
converge. Our theorem then takes the form:

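If $f(\theta)$ is integrable, and

$$A_0 = \frac{1}{2\pi}\int_0^{2\pi} f(\theta)\,d\theta, \qquad
  A_n = \frac{1}{\pi}\int_0^{2\pi} f(\theta)\cos n\theta\,d\theta, \qquad
  B_n = \frac{1}{\pi}\int_0^{2\pi} f(\theta)\sin n\theta\,d\theta,$$

then the Fourier series

$$\phi = A_0 + \sum_{n=1}^{\infty}(A_n\cos n\theta + B_n\sin n\theta)$$

and the allied series

$$\psi = \sum_{n=1}^{\infty}(A_n\sin n\theta - B_n\cos n\theta),$$

if summed by Abel's method, give

$$\phi = \tfrac12\lim_{\delta,\eta\to 0}\{f(\theta+\delta) + f(\theta-\eta)\},$$

where $\delta, \eta > 0$, at any value of $\theta$ where this limit exists, and

$$\psi = -\frac{1}{2\pi}\,P\!\int_0^{2\pi} f(x)\cot\tfrac12(x-\theta)\,dx$$

for any value of $\theta$ where $f(\theta)$ satisfies a Lipschitz condition.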

The use of Abel's theorem can be justified immediately in a large class of cases. The
conditions in 14.04 are sufficient for convergence on the circle.

Without previous knowledge of the properties of $f(\theta)$, something can still be said
about the convergence if the coefficients in the series are known. By a theorem of
Tauber the Fourier series converges if the Abel sum exists and $nA_n$ and $nB_n$ tend to 0;
by an extension due to Littlewood it is sufficient that they should be bounded.

So far as this theorem relates to $\phi$ it bears the name of Fourier.* Study of the allied
series is modern.† The present form of the theorem seems to be the one with the most
direct physical applications. It is known that a Fourier series is always summable, even
if not convergent, by a method of Cesàro known as (C, 1), which is less drastic than Abel
summation, at all points where $f(\theta+)$ and $f(\theta-)$ exist; but in potential problems the
trigonometrical factors are associated with powers of $r$ in such a way as to make Abel
summation arise naturally, and then the functions $\phi$ and $\psi$ are determined at all internal
points by the integrals (5) and (6), given the values of $\phi$ over the surface. It would be
equally valid, since $\phi$ is the function allied to $\psi$, to take as $f(\theta)$ the values of $\psi$ over the
surface provided $\psi$ is differentiable; then (5) will give $\psi$ at internal points and (6) will give
$-\phi$. If there are local concentrations of charge, $\psi$ will have finite discontinuities and $\phi$
will not be determinate there, since the principal value of the integral giving it will not
exist there, and $\phi$ will tend to infinity logarithmically as we approach a discontinuity
in $\psi$. If $\phi$ is discontinuous at a point it would correspond in the electrostatic problem to
a doublet with its axis tangential, and the charge locally will be indeterminate. Hence
the special cases that arise in the summations correspond to physical difficulties also.

* For no very obvious reason. The problem of the vibrating string, with twice differentiable initial
displacement, was solved by d'Alembert and Euler in 1747. D. Bernoulli got the solution as a sine
series in 1753. If the solution is unique the two forms must be equivalent and the Fourier sine theorem
follows. Fourier, in his Analytical Theory of Heat, 1822, gave an alleged proof, which is a mathematical
nightmare. The book is an excellent work on heat conduction. The first proof of the theorem under
reasonably general conditions was due to Dirichlet in 1829.

† See Hardy and Littlewood, Proc. Lond. Math. Soc. (2) 24, 1925, 211-240.

The introduction of Abel summation is due to Poisson. 

We have seen that a function analytic within a contour can be determined entirely in 
terms of its values on the contour. The real and imaginary parts, however, satisfy Laplace’s 
equations separately, and therefore by general potential theory each is determined by its 
values on the contour. Further, by Cauchy’s theorem the real and imaginary parts are 
connected even on the contour. The present result shows what this relation is when the 
contour is a circle. An additive constant could be included in either $\phi$ or $\psi$ without
upsetting the Cauchy-Riemann relations. We have arbitrarily taken the constant term in $\psi$
as zero. This is physically unimportant because we are not usually much concerned with
the absolute values of $\phi$ and $\psi$, but only with their differences from place to place, so that
the constant is usually irrelevant.


14.051. The cosine and sine series. Suppose that $f(x)$ is such that $f(2\pi - x) = f(x)$.
Then

$$A_0 = \frac{1}{2\pi}\int_0^{2\pi} f(x)\,dx = \frac{1}{\pi}\int_0^{\pi} f(x)\,dx, \tag{1}$$

$$A_n = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos nx\,dx = \frac{2}{\pi}\int_0^{\pi} f(x)\cos nx\,dx, \tag{2}$$

$$B_n = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin nx\,dx = 0. \tag{3}$$

Hence the Fourier series

$$A_0 + \sum A_n\cos nx \tag{4}$$

will represent $f(x)$ from 0 to $\pi$, and $f(2\pi - x)$ from $\pi$ to $2\pi$.
On the other hand, if $f(2\pi - x) = -f(x)$,

$$A_0 = 0, \qquad A_n = 0, \tag{5}$$

$$B_n = \frac{2}{\pi}\int_0^{\pi} f(x)\sin nx\,dx, \tag{6}$$


and the series $\sum B_n\sin nx$, with $B_n$ determined by (6), will represent $f(x)$ from 0 to $\pi$, and
$-f(2\pi - x)$ from $\pi$ to $2\pi$. We thus have two representations of the same function valid
from 0 to $\pi$; but they represent different functions from $\pi$ to $2\pi$. This property is very
different from that possessed by power series. It can be regarded as a method of
continuation; but if, for instance, $f(x) = x$ from $x = 0$ to $\pi$, the analytic continuation is $x$
in $\pi < x < 2\pi$, the continuation by (4) is $2\pi - x$, and that by (5) is $-(2\pi - x)$. All are correct
in their proper places, but the decision between them depends on the particular problem,
which will itself indicate if the function is analytic, or symmetrical or antisymmetrical
about $x = \pi$. If $f(-x) = f(x) = f(2\pi - x)$ and $f(\pi - x) = -f(x)$, so that the function is even
but is antisymmetrical about $\tfrac12\pi$,

$$A_0 = \frac{1}{2\pi}\left(\int_{-\pi}^{-\frac12\pi} + \int_{-\frac12\pi}^{0} + \int_{0}^{\frac12\pi} + \int_{\frac12\pi}^{\pi}\right) f(x)\,dx, \tag{7}$$

in which the first and second integrals cancel, also the third and fourth,

$$A_n = \frac{2}{\pi}\int_0^{\pi} f(x)\cos nx\,dx = 0 \quad \text{for } n \text{ even}, \tag{8}$$

$$\phantom{A_n} = \frac{4}{\pi}\int_0^{\frac12\pi} f(x)\cos nx\,dx \quad \text{for } n \text{ odd}, \tag{9}$$

$$B_n = 0. \tag{10}$$

Hence

$$f(x) = \sum_{m=0}^{\infty}\frac{4}{\pi}\cos(2m+1)x\int_0^{\frac12\pi} f(x)\cos(2m+1)x\,dx. \tag{11}$$


Examples. Take $f(\theta) = 1$ for $0 < \theta < \pi$; then the constant term in the cosine series is 1
and the rest are 0, and the function is everywhere 1; as it should be since the function
represented by the cosine series satisfies $f(2\pi - \theta) = f(\theta) = f(2\pi + \theta)$. But for the sine series
we have

$$B_n = \frac{2}{\pi}\int_0^{\pi}\sin n\theta\,d\theta = \frac{2}{n\pi}(1 - \cos n\pi) = 0 \text{ or } \frac{4}{n\pi}, \tag{12}$$

according as $n$ is even or odd. Hence

$$1 = \frac{4}{\pi}\left(\sin\theta + \tfrac13\sin 3\theta + \tfrac15\sin 5\theta + \dots\right) \quad (0 < \theta < \pi). \tag{13}$$


To carry out Abel summation on the series we have

$$S = \lim_{r\to 1}\frac{2}{\pi i}\left\{\left(re^{i\theta} + \tfrac13 r^3e^{3i\theta} + \dots\right) - \left(re^{-i\theta} + \tfrac13 r^3e^{-3i\theta} + \dots\right)\right\} \tag{14}$$

$$= \lim_{r\to 1}\frac{2}{\pi i}\left\{\tanh^{-1}(re^{i\theta}) - \tanh^{-1}(re^{-i\theta})\right\}. \tag{15}$$

But

$$\tanh(x - y) = \frac{\tanh x - \tanh y}{1 - \tanh x\tanh y}, \tag{16}$$

and

$$S = \lim_{r\to 1}\frac{2}{\pi i}\tanh^{-1}\frac{2ir\sin\theta}{1 - r^2} = \lim_{r\to 1}\frac{2}{\pi}\tan^{-1}\frac{2r\sin\theta}{1 - r^2}. \tag{17}$$

$\tan^{-1} z$ being many-valued we must take the value that tends to 0 with $r$, since the series
on the right of (14) do. Then if $\sin\theta > 0$ the limit is 1, which verifies the result. But if
$\sin\theta < 0$ the limit is $-1$, as we should expect since the sine series represents a function
antisymmetrical about $\pi$.
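
As a numerical illustration of this Abel summation (a sketch added here, not taken from the text), the series of (13) with the factors $r^n$ inserted can be summed directly and compared with the closed form inside the limit in (17). The function names and the number of terms are assumptions made for the example.

```python
import math

def abel_sum_sine_series(r, theta, terms=4000):
    """(4/pi) * sum over odd n of r**n * sin(n*theta) / n, for 0 <= r < 1."""
    total = 0.0
    for n in range(1, 2 * terms, 2):
        total += r ** n * math.sin(n * theta) / n
    return 4.0 / math.pi * total

def closed_form(r, theta):
    """(2/pi) * arctan(2 r sin(theta) / (1 - r^2)), the branch vanishing with r."""
    return 2.0 / math.pi * math.atan(2.0 * r * math.sin(theta) / (1.0 - r * r))

for r in (0.5, 0.9, 0.99):
    print(r, abel_sum_sine_series(r, 1.0), closed_form(r, 1.0))
```

As $r$ increases towards 1 the printed values should move towards $+1$ for $\sin\theta > 0$; replacing `1.0` by `-1.0` gives values moving towards $-1$.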

If $\theta = 0$ or $\pi$, the series vanishes. This agrees with our result that at a point of
discontinuity the sum of the series is the mean of the limits, here $+1$ and $-1$, on opposite
sides of it.

The allied series is

$$\psi = -\frac{4}{\pi}\left(\cos\theta + \tfrac13\cos 3\theta + \dots\right),$$

and

$$(\phi + i\psi)_{r<1} = -\frac{4i}{\pi}\left\{re^{i\theta} + \tfrac13 r^3e^{3i\theta} + \dots\right\} = -\frac{4i}{\pi}\tanh^{-1}(re^{i\theta})
 = -\frac{2i}{\pi}\log\frac{1 + re^{i\theta}}{1 - re^{i\theta}}$$

$$= -\frac{2i}{\pi}\log\frac{1 + r\cos\theta + ir\sin\theta}{1 - r\cos\theta - ir\sin\theta}$$

$$= -\frac{2i}{\pi}\left\{\tfrac12\log\frac{(1 + r\cos\theta)^2 + r^2\sin^2\theta}{(1 - r\cos\theta)^2 + r^2\sin^2\theta}
 + i\tan^{-1}\frac{r\sin\theta}{1 + r\cos\theta} + i\tan^{-1}\frac{r\sin\theta}{1 - r\cos\theta}\right\}$$

$$\to -\frac{2i}{\pi}\left\{\tfrac12\log\frac{1 + \cos\theta}{1 - \cos\theta} + i\left(\tfrac12\theta + \tfrac12\pi - \tfrac12\theta\right)\right\}
 = 1 - \frac{2i}{\pi}\log\cot\tfrac12\theta.$$

We recover the previous value of $\phi$; and

$$\psi = -\frac{2}{\pi}\log\cot\tfrac12\theta,$$

which is infinite at the points of discontinuity, as expected. Changing the sign of $\theta$ does
not alter $\psi$, which is therefore in general $-\dfrac{2}{\pi}\log|\cot\tfrac12\theta|$.

The points of discontinuity of $\phi$ are seen to correspond to the points $z = re^{i\theta} = \pm 1$,
where $\phi + i\psi$ has branch points. The behaviour of $\phi$ and $\psi$ at such points is connected
with the fact that near $z = 0$, the real part of $i\log z$ changes by $\pm\pi$ as we go half-way round,
but the imaginary part is $\log r$, which tends to infinity logarithmically.
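
The value quoted for $\psi$ on the circle can be compared with a direct Abel sum of the allied series. The following is only an illustrative sketch; the names `abel_sum_allied` and `limit_value`, the radius 0.999 and the number of terms are choices made for this example.

```python
import math

def abel_sum_allied(r, theta, terms=20000):
    """psi(r, theta) = -(4/pi) * sum over odd n of r**n * cos(n*theta) / n."""
    total = 0.0
    for n in range(1, 2 * terms, 2):
        total += r ** n * math.cos(n * theta) / n
    return -4.0 / math.pi * total

def limit_value(theta):
    """-(2/pi) * log|cot(theta/2)|, the boundary value of psi found above."""
    return -2.0 / math.pi * math.log(abs(1.0 / math.tan(0.5 * theta)))

# Away from theta = 0 and pi the two numbers should nearly agree for r close to 1.
print(abel_sum_allied(0.999, 1.0), limit_value(1.0))
```

Moving `theta` towards 0 or $\pi$ makes both numbers grow in magnitude, in line with the logarithmic infinity at the discontinuities of $\phi$.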

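Next, take

$$f(\theta) = \theta(\pi - \theta) \quad (0 < \theta < \pi).$$

For the cosine series

$$A_0 = \frac{1}{\pi}\int_0^{\pi}\theta(\pi - \theta)\,d\theta = \tfrac16\pi^2,$$

$$A_n = \frac{2}{\pi}\int_0^{\pi}\theta(\pi - \theta)\cos n\theta\,d\theta = -\frac{2}{n^2}\{1 + (-1)^n\},$$

$$f(\theta) = \tfrac16\pi^2 - \cos 2\theta - \frac{\cos 4\theta}{4} - \frac{\cos 6\theta}{9} - \dots.$$

Notice that since $f(\theta) \to 0$ at $\theta = 0$,

$$\tfrac16\pi^2 = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \dots,$$

a relation that we have found in considering the Bernoulli numbers.
The sine series is

$$\theta(\pi - \theta) = \frac{8}{\pi}\left(\sin\theta + \frac{\sin 3\theta}{3^3} + \frac{\sin 5\theta}{5^3} + \dots\right).$$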


Notice the much more rapid convergence of the sine series. Three terms give an accuracy
of under 0.3% of the maximum, whereas three of the cosine series give errors reaching
over 4% of the maximum. The opposite feature was found for the function $f(\theta) = 1$,
where one term of the cosine series was enough but convergence of the sine series was slow.
Generally speaking if the function tends to zero at a terminus, convergence is faster for
the series all of whose terms vanish there; if its derivative is zero, it is better to use the
series whose terms have zero derivatives there.

Note that if we put $\theta = \tfrac12\pi$ in the sine series we get

$$\pi^3 = 32\left(1 - \frac{1}{3^3} + \frac{1}{5^3} - \dots\right).$$
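
The comparison of the two series for $\theta(\pi - \theta)$ can be tried numerically. The sketch below is illustrative only; in particular, counting the constant $\tfrac16\pi^2$ as the first of the "three terms" of the cosine series is an assumption of this example, as are the function names and the sampling grid.

```python
import math

def f(theta):
    return theta * (math.pi - theta)

def sine_partial(theta, terms):
    """First `terms` terms of (8/pi) * sum over odd n of sin(n*theta) / n**3."""
    return 8.0 / math.pi * sum(math.sin(n * theta) / n ** 3
                               for n in range(1, 2 * terms, 2))

def cosine_partial(theta, terms):
    """Constant pi**2/6 followed by `terms - 1` terms -cos(2*m*theta) / m**2."""
    return math.pi ** 2 / 6.0 - sum(math.cos(2 * m * theta) / m ** 2
                                    for m in range(1, terms))

def max_relative_error(partial, terms, samples=2000):
    """Largest |partial - f| on a grid over [0, pi], relative to max f = pi**2/4."""
    peak = math.pi ** 2 / 4.0
    grid = (math.pi * k / samples for k in range(samples + 1))
    return max(abs(partial(t, terms) - f(t)) for t in grid) / peak

mid = 0.5 * math.pi
print(abs(sine_partial(mid, 3) - f(mid)) / f(mid))   # sine series at the midpoint
print(max_relative_error(cosine_partial, 3))         # cosine series, worst grid point

# The relation for pi**3 obtained by putting theta = pi/2 in the sine series.
print(32.0 * sum((-1) ** k / (2 * k + 1) ** 3 for k in range(1000)), math.pi ** 3)
```

The cosine series does worst near the ends of the range, where the function vanishes but the cosine terms do not, which is the behaviour described in the general remark above.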


The allied series is

$$\psi = -\frac{8}{\pi}\left(\cos\theta + \frac{\cos 3\theta}{3^3} + \dots\right),$$

which converges for all values of $\theta$. There are, however, singularities on the unit circle;
otherwise the combined series

$$z + \frac{z^3}{3^3} + \frac{z^5}{5^3} + \dots$$

would converge for some values of $|z| > 1$, and it does not. From the fact that the sine
series represents $\pi\theta + \theta^2$ for $-\pi < \theta < 0$ we see that the first derivative with regard to $\theta$
is continuous, the second discontinuous, and may suspect that $\phi + i\psi$ contains terms of
the form $(z-1)^2\log(z-1)$, and analogously $(z+1)^2\log(z+1)$.

14.06. Integration of Fourier series. We know that a uniformly convergent series
of continuous functions represents a continuous function and also that it is integrable
term by term. The terms in a Fourier series are continuous functions, but we have seen
that they can add up to a discontinuous function. Hence the series in such cases are not
uniformly convergent. It may therefore be asked whether the other characteristic
property of uniformly convergent series, that of being integrable term by term, also fails
for such series. The answer is that it does not. A Fourier series can always be integrated
term by term, not even needing to be convergent, and gives the integral of its defining function.
Let $f(x)$ have the Fourier series in the range 0 to $2\pi$

$$a_0 + \sum (a_n\cos nx + b_n\sin nx). \tag{1}$$

Then

$$F(x) = \int_0^{x} f(t)\,dt \tag{2}$$

exists, because the fact that $f(x)$ has a Fourier series implies that it is integrable. Also
if $f(x)$ is integrable $F(x)$ is continuous; and if $f(x)$ is bounded $F(x)$ has bounded
variation. Then $F(x) - a_0x$ has a convergent Fourier series, say

$$F(x) - a_0x = A_0 + \sum (A_n\cos nx + B_n\sin nx) \quad (0 < x < 2\pi), \tag{3}$$

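and

$$\pi A_n = \int_0^{2\pi}\{F(x) - a_0x\}\cos nx\,dx
 = \left[\frac{1}{n}\{F(x) - a_0x\}\sin nx\right]_0^{2\pi} - \frac{1}{n}\int_0^{2\pi}\sin nx\,\{F'(x) - a_0\}\,dx$$

$$= -\frac{1}{n}\int_0^{2\pi} f(x)\sin nx\,dx = -\frac{\pi b_n}{n} \quad (n \ne 0), \tag{4}$$

$$\pi B_n = \int_0^{2\pi}\{F(x) - a_0x\}\sin nx\,dx
 = -\left[\frac{1}{n}\{F(x) - a_0x\}\cos nx\right]_0^{2\pi} + \frac{1}{n}\int_0^{2\pi}\cos nx\,\{F'(x) - a_0\}\,dx$$

$$= -\frac{1}{n}\{F(2\pi) - 2\pi a_0\} + \frac{\pi a_n}{n}, \tag{5}$$

and the first term is 0 by the definition of $a_0$ as $\dfrac{1}{2\pi}\displaystyle\int_0^{2\pi} f(x)\,dx$. Hence

$$F(x) - a_0x = A_0 + \sum\left(\frac{a_n}{n}\sin nx - \frac{b_n}{n}\cos nx\right). \tag{6}$$

But $F(0) = 0$; hence

$$F(x) = a_0x + \sum\left\{\frac{a_n}{n}\sin nx + \frac{b_n}{n}(1 - \cos nx)\right\}, \tag{7}$$

which is the result of integrating the Fourier series of $f(x)$ term by term.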

Even if $f(x)$ is unbounded, but if $f(x)$ and $|f(x)|$ have improper Riemann integrals,
the theorem remains true.
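
A small numerical illustration of term-by-term integration follows. It is a sketch only: the square wave (equal to +1 on $(0, \pi)$ and -1 on $(\pi, 2\pi)$, with $a_n = 0$ and $b_n = 4/(n\pi)$ for odd $n$) is chosen here as an example of a series that is not uniformly convergent, and the name `integrated_series` and the truncation `N` are hypothetical.

```python
import math

def integrated_series(x, a0, a, b):
    """Formula (7): a0*x + sum_n { a_n/n * sin(n x) + b_n/n * (1 - cos(n x)) }.

    `a` and `b` are lists with a[n-1] = a_n and b[n-1] = b_n.
    """
    total = a0 * x
    for n, (an, bn) in enumerate(zip(a, b), start=1):
        total += an / n * math.sin(n * x) + bn / n * (1.0 - math.cos(n * x))
    return total

# Illustrative input: coefficients of the square wave, truncated at N terms.
N = 20000
a = [0.0] * N
b = [4.0 / (n * math.pi) if n % 2 else 0.0 for n in range(1, N + 1)]

# The integral of the square wave from 0 is x on [0, pi] and 2*pi - x on [pi, 2*pi].
print(integrated_series(1.0, 0.0, a, b))   # close to 1.0
print(integrated_series(4.0, 0.0, a, b))   # close to 2*pi - 4
```

The integrated series has coefficients of order $1/n^2$, so it converges uniformly even though the original series does not.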


14.061. Differentiation of Fourier series. The corresponding proposition for
differentiation is true provided that we understand that the differentiation is to be carried
out within the original circle; this is valid because $\phi + i\psi$ is analytic with regard to $z_0$
within the circle; and the argument may be repeated to show that the limit of the
derivative as $z_0$ approaches the circle is the derivative on the circle, and is equal to the derived
series on the circle at any point where the latter converges. But what usually happens is
that the first or some later derived series converges nowhere on the circle, even though
the derivative itself may exist at almost all points. Thus the function equal to 1 for
$0 < \theta < \pi$ and to $-1$ for $\pi < \theta < 2\pi$ gives the derived series

$$\frac{4}{\pi}(\cos\theta + \cos 3\theta + \cos 5\theta + \dots),$$

which has no obvious meaning, though it is summable by Abel's method; yet the
derivative of the function exists and is 0 except at $\theta = \pi$. In a special sense we may still speak
of the Fourier series of such a derivative, for even if the simple definition of an integral
fails we often define an 'improper' integral as the limit of a sequence of integrals that
exclude the exceptional point. But this process can be applied to such a derivative as the
last; we take ranges that do not include the value $\pi$ where the derivative does not exist,
and then let a terminus tend to $\pi$. In this case all the Fourier coefficients calculated in
this way will be 0, and the Fourier series will be 0, thus agreeing with the derivative at
every point but one. Let us suppose, then, that $f(x)$ has a derivative except at isolated
points, and we want a Fourier series that will represent this derivative at all other points.
We take for $0 < x < \pi$

$$f(x) = A_0 + \sum A_n\cos nx \tag{1}$$

and assume that $f'(x)$ satisfies the conditions for having a Fourier expansion except
possibly at 0 and $\pi$. Then suppose

$$f'(x) = \sum b_n\sin nx. \tag{2}$$

We have

$$b_n = \frac{2}{\pi}\int_0^{\pi} f'(x)\sin nx\,dx = \frac{2}{\pi}\Bigl[f(x)\sin nx\Bigr]_0^{\pi} - \frac{2n}{\pi}\int_0^{\pi} f(x)\cos nx\,dx = -nA_n. \tag{3}$$


Thus, if $f'(x)$ has a Fourier sine expansion for $0 < x < \pi$, it can be found by differentiating
the cosine expansion of $f(x)$ term by term.

If instead we assume

$$f(x) = \sum B_n\sin nx, \tag{4}$$

$$f'(x) = a_0 + \sum a_n\cos nx, \tag{5}$$

then

$$a_0 = \frac{1}{\pi}\int_0^{\pi} f'(x)\,dx = \frac{1}{\pi}\{f(\pi) - f(0)\}, \tag{6}$$

$$a_n = \frac{2}{\pi}\int_0^{\pi} f'(x)\cos nx\,dx = \frac{2}{\pi}\Bigl[f(x)\cos nx\Bigr]_{0+}^{\pi-} + \frac{2n}{\pi}\int_0^{\pi} f(x)\sin nx\,dx
 = nB_n - \frac{2}{\pi}\{f(0+) - (-1)^n f(\pi-)\}. \tag{7}$$

Thus the Fourier series of the derivative of a sine series is not to be found by term-by-term
differentiation unless the original function tends to zero at both limits. But if we have the
limits of $f(x)$ as $x$ tends to 0 and to $\pi$ we can find the correct coefficients. If we put

$$f(0+) + f(\pi-) = A,$$

$$f(0+) - f(\pi-) = B,$$

we can write

$$a_0 = -B/\pi,$$

$$a_n = nB_n - 2A/\pi \quad n \text{ odd},$$

$$a_n = nB_n - 2B/\pi \quad n \text{ even},$$

and (5) will be correct except possibly at 0 and $\pi$.
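
The corrected coefficients can be tried on two simple cases. The sketch below is an illustration, not the authors' computation: the function name `derivative_cosine_coeffs` and the test functions $f(x) = 1$ and $f(x) = x$ on $(0, \pi)$ are chosen here because their derivatives are known exactly.

```python
import math

def derivative_cosine_coeffs(B, f0_plus, fpi_minus):
    """Cosine coefficients (a_0, [a_1, a_2, ...]) of f'(x) on (0, pi).

    B[n-1] is the sine coefficient B_n of f; f0_plus and fpi_minus are the
    limits f(0+) and f(pi-).
    """
    A_sum = f0_plus + fpi_minus    # the quantity called A in the text
    B_diff = f0_plus - fpi_minus   # the quantity called B in the text
    a0 = -B_diff / math.pi
    a = []
    for n, Bn in enumerate(B, start=1):
        correction = A_sum if n % 2 else B_diff
        a.append(n * Bn - 2.0 * correction / math.pi)
    return a0, a

N = 50
# f(x) = 1: B_n = 4/(n*pi) for odd n, 0 for even n; the derivative is 0.
B_one = [4.0 / (n * math.pi) if n % 2 else 0.0 for n in range(1, N + 1)]
# f(x) = x: B_n = 2*(-1)**(n+1)/n; the derivative is 1.
B_x = [2.0 * (-1) ** (n + 1) / n for n in range(1, N + 1)]

print(derivative_cosine_coeffs(B_one, 1.0, 1.0)[0])      # a_0 for f = 1
print(derivative_cosine_coeffs(B_x, 0.0, math.pi)[0])    # a_0 for f = x
```

In both cases naive term-by-term differentiation would give coefficients $nB_n$ that do not tend to zero, while the corrected $a_n$ all vanish.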

14.062. Fluid heated below. This result can sometimes be used in the numerical solution of
differential equations. When a thin layer of liquid is heated below, it does not become unstable
immediately, viscosity and heat conduction together tending to annul any differences of velocity and
temperature. But when the temperature gradient is large enough ascending currents form in some
places and descending ones in others, giving a cellular pattern. The temperature is no longer constant
over horizontal surfaces, being of the form $\beta z + Z\cos lx\cos my$ in the simplest type of solution, where
$z$ is the height, $x$ and $y$ the horizontal coordinates, and $Z$ a function of $z$. It is found that when the
instability first arises $Z$ must satisfy a differential equation of the dimensionless form

$$\left(\frac{d^2}{d\zeta^2} - b^2\right)^3 Z + \mu b^2 Z = 0, \tag{1}$$

where $\zeta$ is proportional to $z$ and $\mu$ to $\beta$, the undisturbed temperature gradient. The boundary
conditions for two perfectly conducting solid boundaries are

$$Z = 0, \qquad Z'' = 0, \qquad Z''' - b^2Z' = 0 \quad (\zeta = 0, \pi). \tag{2}$$

We want a value of $\mu$ such that the differential equation can be satisfied subject to these six boundary
conditions with $Z$ not zero everywhere. Since $Z = 0$ at the boundaries it is natural to assume a sine
series

$$Z = \sum A_n\sin n\zeta. \tag{3}$$

Then

$$Z' = \sum nA_n\cos n\zeta \tag{4}$$

by 14.061 (7); and then by 14.061 (3)

$$Z'' = -\sum n^2A_n\sin n\zeta. \tag{5}$$

But $Z'' = 0$ at the termini. Hence

$$Z''' = -\sum n^3A_n\cos n\zeta, \tag{6}$$

$$Z^{\mathrm{iv}} = \sum n^4A_n\sin n\zeta. \tag{7}$$




There is no restriction on $Z^{\mathrm{iv}}$ at the boundaries. We therefore put

$$\frac{2}{\pi}\{Z^{\mathrm{iv}}(0) + Z^{\mathrm{iv}}(\pi)\} = A, \tag{8}$$

$$\frac{2}{\pi}\{Z^{\mathrm{iv}}(0) - Z^{\mathrm{iv}}(\pi)\} = B, \tag{9}$$

and

$$Z^{\mathrm{v}} = -\tfrac12 B + \sum\{n^5A_n - (A, B)\}\cos n\zeta, \tag{10}$$

$A$ or $B$ being taken according as $n$ is odd or even.

Finally,

$$Z^{\mathrm{vi}} = -\sum\{n^6A_n - (A, B)\,n\}\sin n\zeta \tag{11}$$

and

$$\left(\frac{d^2}{d\zeta^2} - b^2\right)^3 Z + \mu b^2 Z = -\sum (n^2 + b^2)^3A_n\sin n\zeta + \sum\{(A, B)\,n + \mu b^2A_n\}\sin n\zeta = 0. \tag{12}$$


This is the Fourier series of the zero function on the left and therefore the coefficients of all terms
must vanish; thus

$$\{(n^2 + b^2)^3 - \mu b^2\}A_n = \begin{cases} nA & n \text{ odd} \\ nB & n \text{ even.} \end{cases} \tag{13}$$

Also we have the third pair of boundary conditions

$$Z''' - b^2Z' = 0, \tag{14}$$

whence

$$\sum (n^2 + b^2)\,nA_n = 0, \tag{15}$$

$$\sum (-)^n(n^2 + b^2)\,nA_n = 0. \tag{16}$$

Hence the sums of terms with $n$ odd and even vanish separately; and by substitution

$$\sum \frac{n^2(n^2 + b^2)}{(n^2 + b^2)^3 - \mu b^2}\,A = 0 \quad n \text{ odd}, \tag{17}$$

$$\sum \frac{n^2(n^2 + b^2)}{(n^2 + b^2)^3 - \mu b^2}\,B = 0 \quad n \text{ even}. \tag{18}$$


There are thus two distinct types of solution, those depending on odd values of $n$ being symmetrical
about the median plane $\zeta = \tfrac12\pi$, the others antisymmetrical. We wish to find the least value of $\mu$
for each type. Computation is convenient for the following reason. The variation of any term due
to a change of $\mu$ is by a factor decreasing like $n^{-6}$, and the terms themselves decrease like $n^{-2}$. Hence,
if we can compute the series obtained by putting $\mu = 0$, the correcting series depending on $\mu$ will
consist of terms that decrease like $n^{-8}$, and will therefore be very rapidly convergent at the start.
Taking the odd solution first, we have

$$\tanh\tfrac12\pi b = \frac{4b}{\pi}\left\{\frac{1}{1^2 + b^2} + \frac{1}{3^2 + b^2} + \dots\right\}, \tag{19}$$

whence

$$\sum_{n\ \mathrm{odd}}\frac{n^2}{(n^2 + b^2)^2} = \frac{\pi}{8}\left(\frac{d}{db} + \frac{1}{b}\right)\tanh\tfrac12\pi b
 = \tfrac18\pi\left(\frac{1}{b}\tanh\tfrac12\pi b + \tfrac12\pi\operatorname{sech}^2\tfrac12\pi b\right) = T(b) \text{ say}. \tag{20}$$

This can be computed directly as a function of $b$; and by subtraction

$$\sum \frac{n^2\mu b^2}{(n^2 + b^2)^2\{(n^2 + b^2)^3 - \mu b^2\}} + T(b) = 0. \tag{21}$$

The second term in the series being small we can write this as

$$\frac{\mu b^2}{(1 + b^2)^2\{(1 + b^2)^3 - \mu b^2\}} + T(b) + K = 0, \tag{22}$$

where

$$K = \sum_{n = 3, 5, \dots}\frac{n^2\mu b^2}{(n^2 + b^2)^2\{(n^2 + b^2)^3 - \mu b^2\}}, \tag{23}$$

$$\mu b^2 = \frac{(1 + b^2)^3}{1 - 1^2/(1^2 + b^2)^2\{T(b) + K\}}. \tag{24}$$

We first neglect $K$, assume a series of trial values of $b$, and work out $\mu$ for each. The results are


   b        μ         Δ       Δ²      Δ³     Corrected μ

  0.90    17.836
                     -184
  0.95    17.652             +141              17.587
                      -43              -16
  1.00    17.609             +125              17.537
                      +82               -7
  1.05    17.691             +118              17.613
                     +200
  1.10    17.891

We now work out the first two terms of $K$ for $b = 0.95$ and 1.00, using our approximate values of $\mu$.
They are 0.0014 and 0.0016 respectively, $T(b)$ being 0.6177 and 0.4581. Allowing for them in (22)
and interpolating for the minimum we have $\mu = 17.536$ at $b = 0.995$. The parameters $a$ and $A$ used
in other treatments are $\pi b = 3.14$ and $\pi^4\mu = 1708.2$ respectively. Southwell and Pellew, solving
directly in complex exponential functions, get $A = 1707.8$.
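
The tabulated values for the odd solution can be reproduced from (20), (23) and (24). The sketch below is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the series $K$ is truncated at $n = 199$, and the repeated substitution in `mu_odd_corrected` replaces the hand interpolation described in the text.

```python
import math

def T_odd(b):
    """T(b) of (20): closed form of the sum over odd n of n^2 / (n^2 + b^2)^2."""
    x = 0.5 * math.pi * b
    return math.pi / 8.0 * (math.tanh(x) / b + 0.5 * math.pi / math.cosh(x) ** 2)

def K_odd(b, mu, n_max=199):
    """Correcting series (23), summed over n = 3, 5, ..., n_max."""
    X = mu * b * b
    return sum(n * n * X / ((n * n + b * b) ** 2 * ((n * n + b * b) ** 3 - X))
               for n in range(3, n_max + 1, 2))

def mu_odd(b, K=0.0):
    """Formula (24) solved for mu at a trial value of b."""
    p = 1.0 + b * b
    return p ** 3 / (1.0 - 1.0 / (p * p * (T_odd(b) + K))) / (b * b)

def mu_odd_corrected(b, sweeps=20):
    """Repeated substitution: recompute K from the current mu, then mu from K."""
    mu = mu_odd(b)
    for _ in range(sweeps):
        mu = mu_odd(b, K_odd(b, mu))
    return mu

for b in (0.90, 0.95, 1.00, 1.05, 1.10):
    print(b, round(mu_odd(b), 3), round(mu_odd_corrected(b), 3))
```

Scanning `mu_odd_corrected` over a fine grid of `b` and multiplying the smallest value by $\pi^4$ gives a figure that can be compared with the values of $A$ quoted above.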

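The lowest mode with even $n$ can be found similarly; starting with

$$\sum_{n\ \mathrm{even}}\frac{b}{b^2 + n^2} = \tfrac14\pi\coth\tfrac12\pi b - \frac{1}{2b}, \tag{25}$$

we get

$$T(b) = \sum_{n\ \mathrm{even}}\frac{n^2}{(n^2 + b^2)^2} = \frac{\pi}{8}\left(\frac{d}{db} + \frac{1}{b}\right)\coth\tfrac12\pi b
 = \tfrac18\pi\left(\frac{1}{b}\coth\tfrac12\pi b - \tfrac12\pi\operatorname{cosech}^2\tfrac12\pi b\right), \tag{26}$$

and, proceeding as before, we find

$$\mu b^2 = \frac{(4 + b^2)^3}{1 - 4/(4 + b^2)^2\{T(b) + K\}}, \tag{27}$$

where

$$K = \frac{16\mu b^2}{(16 + b^2)^2\{(16 + b^2)^3 - \mu b^2\}} + \frac{36\mu b^2}{(36 + b^2)^2\{(36 + b^2)^3 - \mu b^2\}} + \dots. \tag{28}$$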


Solutions neglecting $K$ are

    b       μ          b       μ

   1.5    186.9       1.7    182.8
   1.55   185.1       1.75   183.0
   1.6    183.9       1.8    183.5
   1.65   183.0       1.85   184.4

For $b = 1.70$ the correcting terms $K$ are 0.0040, while $T(b)$ is 0.2213. The corrected $\mu$ is then 180.8.

The interest of even $n$, pointed out by Southwell and Pellew, is in the fact that in these solutions
the conditions $Z = Z'' = 0$ are satisfied at $\zeta = \tfrac12\pi$, with the further condition $Z^{\mathrm{iv}} - b^2Z'' = 0$, which is the
condition for a free surface, replacing $Z''' - b^2Z' = 0$. Hence this solution is the solution for a liquid with
a perfectly conducting rigid boundary at the bottom, and a depth half that used in the first case. We
get $b$ and $\mu$ for a layer with the same depth as in the first case by multiplying by $\tfrac12$ and $\tfrac{1}{16}$ respectively.
Restoring also the factor $\pi^4$ we have $A = 1100.6$. Southwell and Pellew get $A = 1100$.
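
The even solution can be treated in the same way from (26), (27) and (28). Again this is a sketch with hypothetical function names; the truncation of $K$ at $n = 200$ and the repeated substitution are assumptions of the example, and the final rescaling follows the halved depth described above ($\mu$ divided by 16, then multiplied by $\pi^4$).

```python
import math

def T_even(b):
    """T(b) of (26): closed form of the sum over even n of n^2 / (n^2 + b^2)^2."""
    x = 0.5 * math.pi * b
    return math.pi / 8.0 * (1.0 / (b * math.tanh(x))
                            - 0.5 * math.pi / math.sinh(x) ** 2)

def K_even(b, mu, n_max=200):
    """Correcting series (28), summed over n = 4, 6, ..., n_max."""
    X = mu * b * b
    return sum(n * n * X / ((n * n + b * b) ** 2 * ((n * n + b * b) ** 3 - X))
               for n in range(4, n_max + 1, 2))

def mu_even(b, K=0.0):
    """Formula (27) solved for mu at a trial value of b."""
    p = 4.0 + b * b
    return p ** 3 / (1.0 - 4.0 / (p * p * (T_even(b) + K))) / (b * b)

def mu_even_corrected(b, sweeps=20):
    """Repeated substitution: recompute K from the current mu, then mu from K."""
    mu = mu_even(b)
    for _ in range(sweeps):
        mu = mu_even(b, K_even(b, mu))
    return mu

print(round(mu_even(1.7), 1), round(mu_even(1.7, 0.0040), 1))
mu_min = min(mu_even_corrected(1.5 + 0.002 * k) for k in range(201))
print(math.pi ** 4 * mu_min / 16.0)   # value for the layer with one free surface
```

The last printed figure can be compared with the values of $A$ quoted in the text for this case.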

The problem has a considerable literature.* The possibility of using the rules for finding the Fourier
series for the derivatives of a function with no singularities within the range was suggested by Dr S.
Goldstein. It will be seen that the rapid convergence depends on the fact that the terms of the series $\sum n^{-8}$
decrease extremely rapidly at the beginning, and on the possibility of combining the parts not involving
$\mu$ into a known function of $b$. It is quite convenient to use considering that the solution depends on
a sixth order equation with two adjustable parameters.

* Jeffreys, Phil. Mag. (7) 2, 1926, 833-44; Proc. Roy. Soc. A, 118, 1928, 195-208; A. Pellew and
R. V. Southwell, Proc. Roy. Soc. A, 176, 1940, 312-43.

14.07. The Gibbs phenomenon. This is a peculiarity of the sum of a finite number
of terms of a Fourier series when the function has a simple discontinuity. It is sufficient
to consider the function that is equal to 1 for $0 < x < \pi$ and to $-1$ when $\pi < x < 2\pi$. From
14.04 (3) the sum of the terms up to those in $nx$ is


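$$S_n(x) = \frac{1}{2\pi}\int_0^{\pi}\frac{\sin(n+\frac12)(t-x)}{\sin\frac12(t-x)}\,dt - \frac{1}{2\pi}\int_{\pi}^{2\pi}\frac{\sin(n+\frac12)(t-x)}{\sin\frac12(t-x)}\,dt$$

$$= \frac{1}{2\pi}\int_0^{\pi}\left\{\frac{\sin(n+\frac12)(t-x)}{\sin\frac12(t-x)} - \frac{\sin(n+\frac12)(t+x)}{\sin\frac12(t+x)}\right\}dt$$

$$= \frac{1}{2\pi}\int_{-x}^{\pi-x}\frac{\sin(n+\frac12)\theta}{\sin\frac12\theta}\,d\theta - \frac{1}{2\pi}\int_{x}^{\pi+x}\frac{\sin(n+\frac12)\theta}{\sin\frac12\theta}\,d\theta$$

$$= \frac{1}{2\pi}\left(\int_{-x}^{x} - \int_{\pi-x}^{\pi+x}\right)\frac{\sin(n+\frac12)\theta}{\sin\frac12\theta}\,d\theta.$$

We may regard the first integral as representing the effects of the discontinuity at 0, the
second of that at $\pi$. For $x$ small the second will be small, since $\sin\tfrac12\theta$ is about 1 when $\theta$
is near $\pi$. For the first we write

$$n + \tfrac12 = m, \qquad m\theta = \xi,$$

and it becomes

$$\frac{1}{\pi}\int_0^{mx}\frac{\sin\xi\,d\xi}{m\sin(\xi/2m)}.$$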


This is 0 when $x = 0$, and increases till $mx = \pi$. The maximum value then is

$$\frac{1}{\pi}\int_0^{\pi}\frac{\sin\xi\,d\xi}{m\sin(\xi/2m)} \approx \frac{2}{\pi}\int_0^{\pi}\frac{\sin\xi}{\xi}\,d\xi$$

when $m$ is large. If the upper limit was $\infty$ the integral would be $\tfrac12\pi$ and thus give the
limit 1, as we should expect. But

$$\int_0^{\infty}\frac{\sin\xi}{\xi}\,d\xi = \int_0^{\pi}\frac{\sin\xi}{\xi}\,d\xi + \int_{\pi}^{3\pi}\frac{\sin\xi}{\xi}\,d\xi + \int_{3\pi}^{5\pi}\frac{\sin\xi}{\xi}\,d\xi + \dots,$$

and every term after the first is negative. Hence

$$\frac{2}{\pi}\int_0^{\pi}\frac{\sin\xi}{\xi}\,d\xi > 1.$$

It is actually about 1.179.* Hence near the discontinuity at $x = 0$ the sum of a finite number
of terms of the Fourier series overshoots the mark appreciably. Increasing the number
of terms does not remove this peculiarity; it merely shifts it nearer to the discontinuity.

The explanation is easy. The sum of a finite number of terms of the series is a
continuous function, and the difference between it and $f(x)$ is orthogonal to all the trigonometric
terms up to $\cos nx$ and $\sin nx$. But for some distance from 0, $S_n(x)$ is less than $f(x)$ because
$f(x)$ jumps to 1 immediately and $S_n(x)$ does not. This difference will make a negative
contribution to $\int\{S_n(x) - f(x)\}\sin mx\,dx$ $(m \le n)$, which must be compensated by a positive
contribution somewhere else. But it can be compensated approximately for all $m$ not too
large by having $S_n(x) > f(x)$ in an adjacent range.

This series could hardly be used for computation, and the Gibbs phenomenon is only
another warning that Fourier series are not of much use for direct computation unless the
coefficients decrease at least as fast as $n^{-2}$.

* The numerical value is given wrongly in several books.
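
The overshoot can be checked directly. The following sketch is illustrative only: it estimates $(2/\pi)\int_0^{\pi}\sin\xi\,d\xi/\xi$ by Simpson's rule and evaluates a partial sum of the square-wave series at $x = \pi/(n + \tfrac12)$, the point given above for the maximum when the second integral is neglected. The function names and the choice $n = 201$ are assumptions of the example.

```python
import math

def sine_integral(upper, panels=2000):
    """Composite Simpson estimate of the integral of sin(t)/t from 0 to `upper`."""
    def g(t):
        return 1.0 if t == 0.0 else math.sin(t) / t
    h = upper / panels  # `panels` must be even for Simpson's rule
    total = g(0.0) + g(upper)
    for k in range(1, panels):
        total += (4.0 if k % 2 else 2.0) * g(k * h)
    return total * h / 3.0

def square_wave_partial_sum(x, n):
    """Fourier series of the square wave summed up to the term in n*x."""
    return 4.0 / math.pi * sum(math.sin(k * x) / k for k in range(1, n + 1, 2))

print(2.0 / math.pi * sine_integral(math.pi))          # limiting overshoot value
n = 201
print(square_wave_partial_sum(math.pi / (n + 0.5), n))  # partial sum near its first maximum
```

Increasing `n` moves the evaluation point towards the discontinuity but leaves the second printed value close to the first.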


14.08. Weierstrass's theorem on approximations by polynomials. If a function
is continuous in any finite interval, at the ends of which it has the same value, a finite
number of harmonic terms can be found such that their sum differs from the function by less
than $\epsilon$ at every point of the interval; and a polynomial can be found with the same property.
The proof of Fourier's theorem given in 14.04 assumed the function to have bounded
variation; in a certain sense we shall see that this assumption is unnecessary. Evidently
by a linear transformation of the independent variable we can make the interval 0 to $2\pi$,
and the function will also be continuous with regard to this variable. Then the conditions
on $f(x)$ are that it is continuous for $0 \le x \le 2\pi$ and $f(0) = f(2\pi)$. Now, since a continuous
function is uniformly continuous, for a given positive $\omega$ we can choose a set of points of
subdivision $0, x_1, x_2, \dots, x_m, 2\pi$ such that the upper and lower bounds of $f(x)$ in every
interval differ by less than $\omega$. In each interval $x_r$ to $x_{r+1}$ take the linear function that agrees
with $f(x)$ at $x_r$ and $x_{r+1}$. Then this function differs from $f(x)$ by not more than $\omega$, since it
always lies between the upper and lower bounds of $f(x)$ in the interval. We thus have a
function $g(x)$ defined for each interval, continuous at all points, including the points of
subdivision, and of bounded variation ($\le (m+1)\omega$) between 0 and $2\pi$. It therefore
can be expressed as a Fourier series, and it nowhere differs from $f(x)$ by more than $\omega$. The
introduction of $g(x)$ cuts out small but rapid fluctuations such as those of $x\sin 1/x$ near
$x = 0$, which could make $f(x)$ have infinite total variation without being discontinuous.

Now consider the contributions to the Fourier coefficients from the range $x_r$ to $x_{r+1}$.
We have in this range

$$g(x) = \frac{(x_{r+1} - x)f(x_r) + (x - x_r)f(x_{r+1})}{x_{r+1} - x_r} = ax + b, \text{ say},$$

$$\frac{1}{2\pi}\int_{x_r}^{x_{r+1}} g(x)\,dx = \frac{1}{4\pi}(x_{r+1} - x_r)\{f(x_r) + f(x_{r+1})\},$$

$$\frac{1}{\pi}\int_{x_r}^{x_{r+1}} g(x)\cos nx\,dx = \frac{1}{n\pi}\Bigl[(ax + b)\sin nx\Bigr]_{x_r}^{x_{r+1}} + \frac{a}{n^2\pi}\Bigl[\cos nx\Bigr]_{x_r}^{x_{r+1}}$$

$$= \frac{1}{n\pi}\{f(x_{r+1})\sin nx_{r+1} - f(x_r)\sin nx_r\} + \frac{f(x_{r+1}) - f(x_r)}{n^2\pi(x_{r+1} - x_r)}(\cos nx_{r+1} - \cos nx_r),$$

$$\frac{1}{\pi}\int_{x_r}^{x_{r+1}} g(x)\sin nx\,dx
 = -\frac{1}{n\pi}\{f(x_{r+1})\cos nx_{r+1} - f(x_r)\cos nx_r\} + \frac{f(x_{r+1}) - f(x_r)}{n^2\pi(x_{r+1} - x_r)}(\sin nx_{r+1} - \sin nx_r).$$

When contributions for different intervals are added, the terms in the curled brackets
all cancel, those from $2\pi$ cancelling those from 0. Hence

$$g(x) = \sum_{r=0}^{m}\frac{1}{4\pi}(x_{r+1} - x_r)\{f(x_{r+1}) + f(x_r)\}
 + \sum_{n=1}^{\infty}\sum_{r=0}^{m}\frac{f(x_{r+1}) - f(x_r)}{(x_{r+1} - x_r)\,n^2\pi}\{\cos n(x - x_{r+1}) - \cos n(x - x_r)\},$$

where $x_0 = 0$, $x_{m+1} = 2\pi$. The terms after the constant are all less than some constant
multiple of $1/n^2$. Hence the series is uniformly convergent. It is therefore possible to
choose $n$ so that for every $x$ the sum of terms up to those in $nx$ differs from $g(x)$ by less
than $\omega$ and therefore from $f(x)$ by less than $2\omega$.

We thus have a finite set of harmonic terms. But cos nx and sin nx can be expanded in
uniformly convergent power series. We expand each harmonic term up to such a power
of x, say x^s, that the total error committed in neglecting terms after x^s in all terms is less
than ω. Then collecting terms in like powers of x we have a polynomial in x of degree s
that nowhere differs from f(x) by as much as 3ω.

For a given ε, we can take ω = ⅓ε; then the harmonic series up to terms in nx nowhere
differs from f(x) by as much as ⅔ε, and the polynomial nowhere by ε. This proves the
theorem.
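
The first step of the proof, replacing f(x) by the piecewise-linear g(x), can be tried out numerically. The sketch below is only illustrative: the test function, the equal spacing of the points of subdivision and the sampling density are assumptions, not part of the text. It checks on a finite sample that g never departs from f by more than the oscillation of f within an interval.

```python
import math


def f(x):
    # Hypothetical test function: continuous on [0, 2*pi] with f(0) = f(2*pi) = 0,
    # and with the rapid x*sin(1/x) fluctuations near x = 0 mentioned in the text.
    if x == 0.0:
        return 0.0
    return x * math.sin(1.0 / x) * (2.0 * math.pi - x)


def gap_and_oscillation(func, m, per_interval=50):
    """Divide [0, 2*pi] into m + 1 equal intervals and join the end values of
    func by straight lines, giving g(x).  Returns (largest |func - g| found,
    largest oscillation of func found), both taken over a finite sample."""
    nodes = [2.0 * math.pi * r / (m + 1) for r in range(m + 2)]
    worst_gap = 0.0
    worst_osc = 0.0
    for r in range(m + 1):
        a, b = nodes[r], nodes[r + 1]
        fa, fb = func(a), func(b)
        xs = [a + (b - a) * k / per_interval for k in range(per_interval)] + [b]
        fs = [func(x) for x in xs]
        # g is the straight line through (a, fa) and (b, fb).
        gs = [fa + (fb - fa) * (x - a) / (b - a) for x in xs]
        worst_gap = max(worst_gap, max(abs(u - v) for u, v in zip(fs, gs)))
        worst_osc = max(worst_osc, max(fs) - min(fs))
    return worst_gap, worst_osc


if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, gap_and_oscillation(f, m))
```

Refining the subdivision reduces the sampled oscillation, which is the role played by ω in the argument.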

For the polynomial approximation it is unnecessary that f(0) = f(2π). For we can take

$$f(x) = \frac{1}{2\pi}\{x f(2\pi) + (2\pi - x) f(0)\} + h(x),$$

in which the first expression is a polynomial and the second satisfies the conditions imposed
in the main theorem.

This theorem is important partly because it makes it possible to replace functions that
are continuous but have not bounded variation by functions of two of the simplest
possible types, with a known limit of error. But it will also replace a continuous function
that is not differentiable by one that has derivatives of all orders, and if this is done many
proofs can be simplified.

The Fourier series for f(x) and g(x) are obviously different, but will agree closely in the
early terms. The polynomial expression will not in general be identical with the
interpolation polynomial found from f(0), f(x_1), ..., f(2π) by divided differences, since the
neglected terms will not vanish exactly at the values in question, and to obtain the requisite
accuracy it may be necessary to keep more than m + 1 terms if f(x) fluctuates rapidly.
In the latter case numerical interpolation might fail owing to large higher derivatives.

14·081. Extension of Weierstrass's approximation theorem. If ∫_0^{2π} f(x) dx exists,
the upper and lower bounds of f(x) differing by M, then for any ε, δ a finite number of harmonic
terms can be found such that their sum differs from f(x) by less than ε at every point of the
interval, except possibly within a set of subintervals of total length δ, within which the sum
nowhere differs from f(x) by more than M + ε; and a polynomial can be found with the same
property.

We use du Bois-Reymond's necessary and sufficient condition (1·1011) for the existence
of the Riemann integral. For an arbitrary ω we can enclose all discontinuities of f(x) where
the leap is ≥ ω within a finite number m of subintervals of total length δ, each discontinuity
being at an interior point of the subinterval, unless it is 0 or 2π, when we take the
discontinuity as an end point. Call these intervals O. As in 14·08 we divide the remainder of
the interval (0, 2π) into n subintervals in each of which the leap of the function is less than
ω, and construct a continuous function g(x) by linear interpolation. g(x) is of bounded
variation, ≤ nω + mM. In the m subintervals O, |g(x) − f(x)| ≤ M; in the rest,
|g(x) − f(x)| ≤ ω.

Then we can find a sum of a finite number of harmonic terms, nowhere differing from
g(x) by more than ω, and a polynomial nowhere differing from g(x) by more than 2ω; the
sum of harmonic terms therefore differs from f(x) nowhere by more than M + 2ω, and
except in the subintervals O it nowhere differs from f(x) by as much as 2ω. For the
polynomial we need only replace 2ω by 3ω. Taking ω = ½ε or ⅓ε (which is independent of δ)
we have the required results.

Corollary. Let T be a sum of harmonic terms with the property stated in the above
theorem, and consider the integral

$$I = \int_0^{2\pi}\{f(x)-T\}^2\,dx.$$

This exists if f(x) has a Riemann integral. Then

$$I < (M+\epsilon)^2\,\delta + (2\pi-\delta)\,\epsilon^2.$$

If η is an arbitrary positive quantity, we can choose ε so that 2πε² < ½η, and then δ so that
(M + ε)²δ < ½η; then I < η. Therefore if f(x) has a Riemann integral we can find a sum T
of a finite number of harmonic terms such that the integral I is arbitrarily small.

14·09. Approximation by least squares: Parseval's theorem. Let S'_n(x) be any
finite sum of the form

$$S'_n(x) = A'_0 + \sum_{r=1}^{n}(A'_r\cos rx + B'_r\sin rx), \tag{1}$$

and let

$$S_n(x) = A_0 + \sum_{r=1}^{n}(A_r\cos rx + B_r\sin rx),$$

where A_r, B_r are the Fourier coefficients of f(x). Then if f(x) is integrable

$$\int_0^{2\pi}\{f(x)-S'_n(x)\}^2\,dx = \int_0^{2\pi}\{f(x)\}^2\,dx - 2\int_0^{2\pi}f(x)S'_n(x)\,dx + \int_0^{2\pi}\{S'_n(x)\}^2\,dx. \tag{2}$$

Now all products of the form cos rx sin sx have zero integrals; so have all of the forms
cos rx cos sx and sin rx sin sx with r ≠ s. Hence

$$\int_0^{2\pi}\{S'_n(x)\}^2\,dx = 2\pi A_0'^2 + \pi\sum_{r=1}^{n}(A_r'^2 + B_r'^2). \tag{3}$$

Also

$$\int_0^{2\pi}f(x)\,dx = 2\pi A_0, \quad \int_0^{2\pi}f(x)\cos rx\,dx = \pi A_r, \quad \int_0^{2\pi}f(x)\sin rx\,dx = \pi B_r, \tag{4}$$

and therefore

$$\int_0^{2\pi}f(x)S'_n(x)\,dx = 2\pi A_0A'_0 + \pi\sum_{r=1}^{n}(A_rA'_r + B_rB'_r) \tag{5}$$

and

$$\int_0^{2\pi}\{f(x)-S'_n(x)\}^2\,dx = \int_0^{2\pi}\{f(x)\}^2\,dx + 2\pi(A_0'^2 - 2A_0A'_0) + \pi\sum_{r=1}^{n}(A_r'^2 - 2A_rA'_r + B_r'^2 - 2B_rB'_r). \tag{6}$$

But

$$A_r'^2 - 2A_rA'_r = (A'_r-A_r)^2 - A_r^2, \qquad B_r'^2 - 2B_rB'_r = (B'_r-B_r)^2 - B_r^2 \tag{7}$$

and

$$\int_0^{2\pi}\{f(x)-S'_n(x)\}^2\,dx = \int_0^{2\pi}\{f(x)\}^2\,dx - 2\pi A_0^2 - \pi\sum_{r=1}^{n}(A_r^2+B_r^2) + 2\pi(A'_0-A_0)^2 + \pi\sum_{r=1}^{n}\{(A'_r-A_r)^2 + (B'_r-B_r)^2\}. \tag{8}$$
But the terms involving A'_r and B'_r in the last expression are all ≥ 0 and vanish when
A'_r = A_r, B'_r = B_r. Hence: if we measure the discrepancy between the function f(x) and a
trigonometrical expression of the form (1) by the integral of the square of the difference between
them, the discrepancy is least if the coefficients are taken to be the Fourier coefficients up to
A_n, B_n. This result may be compared with that of our original method of determining the
coefficients so that the trigonometrical expression would agree exactly with f(x) at equally
spaced points. Here we do not attempt to give an exact fit at any specified points, but aim
at the best fit with the function as a whole, measuring the discrepancy as a whole by the
integral of its square. We find that the best fit is the Fourier expansion up to any order
we choose. Results of this type are found for all expressions in terms of orthogonal
functions, and are not confined to Fourier expansions.
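
The least-squares property in (8) can be illustrated numerically. In the sketch below the function f, the order 3 and the midpoint quadrature are all assumptions made for illustration. Shifting A'_0 away from A_0 by d should raise the discrepancy by 2πd², and shifting A'_1 by d should raise it by πd².

```python
import math

N = 2048  # number of quadrature points (an arbitrary choice)


def integrate(func):
    """Midpoint rule over (0, 2*pi)."""
    h = 2.0 * math.pi / N
    return h * sum(func((k + 0.5) * h) for k in range(N))


def fourier_coefficients(func, order):
    """A_0, [A_1..A_order], [B_1..B_order] as defined by equations (4)."""
    a0 = integrate(func) / (2.0 * math.pi)
    a = [integrate(lambda x, r=r: func(x) * math.cos(r * x)) / math.pi
         for r in range(1, order + 1)]
    b = [integrate(lambda x, r=r: func(x) * math.sin(r * x)) / math.pi
         for r in range(1, order + 1)]
    return a0, a, b


def discrepancy(func, a0, a, b):
    """Integral of {f(x) - S'_n(x)}^2 for the trial coefficients a0, a, b."""
    def trig(x):
        return a0 + sum(ar * math.cos(r * x) + br * math.sin(r * x)
                        for r, (ar, br) in enumerate(zip(a, b), start=1))
    return integrate(lambda x: (func(x) - trig(x)) ** 2)


def f(x):
    # Hypothetical example function on (0, 2*pi).
    return abs(x - math.pi)


a0, a, b = fourier_coefficients(f, 3)
best = discrepancy(f, a0, a, b)
shifted_constant = discrepancy(f, a0 + 0.1, a, b)
shifted_cosine = discrepancy(f, a0, [a[0] + 0.1] + a[1:], b)
```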

Again, let S'_n(x) be a function T defined as in the corollary to 14·081, so that for an
arbitrarily small ε

$$\int_0^{2\pi}\{f(x)-S'_n(x)\}^2\,dx < \epsilon. \tag{9}$$

Hence

$$\int_0^{2\pi}\{f(x)-S_n(x)\}^2\,dx < \epsilon \tag{10}$$

and therefore

$$\int_0^{2\pi}\{f(x)\}^2\,dx - 2\pi A_0^2 - \pi\sum_{r=1}^{n}(A_r^2+B_r^2) < \epsilon. \tag{11}$$

Hence

$$A_0^2 + \tfrac12\sum_{r=1}^{n}(A_r^2+B_r^2) \to \frac{1}{2\pi}\int_0^{2\pi}\{f(x)\}^2\,dx. \tag{12}$$

This is Parseval's theorem.
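
As a check on (12), take the illustrative case f(x) = x on (0, 2π), which is not an example from the text. Its coefficients are A_0 = π, A_r = 0, B_r = −2/r, and the mean square is 4π²/3. The sketch below forms the partial sums on the left of (12), which increase towards the mean square without exceeding it.

```python
import math


def parseval_partial_sum(n):
    """A_0^2 + (1/2) * sum_{r=1}^{n} (A_r^2 + B_r^2) for f(x) = x on (0, 2*pi),
    using A_0 = pi, A_r = 0, B_r = -2/r."""
    a0 = math.pi
    return a0 ** 2 + 0.5 * sum((2.0 / r) ** 2 for r in range(1, n + 1))


# (1 / 2 pi) * integral of x^2 over (0, 2 pi) = (2 pi)^2 / 3
mean_square = (2.0 * math.pi) ** 2 / 3.0

if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, mean_square - parseval_partial_sum(n))
```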





Note that if f(x) has a Riemann integral and is otherwise unrestricted, it may be of 
unbounded variation, and its discontinuities may not be simple; then its Fourier series 
may not converge or even be summable by Abel’s method for some values of x. 

If f(x) and {f(x)}² are integrable, even by improper integrals, (8) still follows, and

$$\int_0^{2\pi}\{f(x)-S_n(x)\}^2\,dx = \int_0^{2\pi}\{f(x)\}^2\,dx - 2\pi A_0^2 - \pi\sum_{r=1}^{n}(A_r^2+B_r^2). \tag{13}$$

The left side is not negative, and by hypothesis ∫_0^{2π} {f(x)}² dx is finite. Hence the sum of
positive terms Σ_{r=1}^{n} (A_r² + B_r²) cannot increase without limit as n increases. Thus: if a function
and its square have improper integrals, the sum of the squares of its Fourier coefficients is
convergent.

As a corollary, if f(x) and {f(x)}² are integrable from 0 to 2π, even by improper integrals,
∫_0^{2π} f(x) cos nx dx and ∫_0^{2π} f(x) sin nx dx tend to zero. This fact is made the basis of some
modern treatments of Fourier's theorem and the trigonometric interpolation polynomials.*

* Dunham Jackson, The Theory of Approximation, 1930; Fourier Series and Orthogonal
Polynomials, 1941.

14·10. Harmonic analysis: correction for averaging. In a modification of the
problem of 14·01 that often occurs in practice the data are not the actual values of f(x)
at x_r = rλ, but means of f(x) over ranges centred on x_r. In studying the diurnal variation
of atmospheric pressure, for instance, it would be natural to read the barometer at hourly
intervals, and we should then have the conditions of 14·01. But in studying wind the data
would be the runs of the anemometer in each hour, and therefore the mean wind for each
hour. The same formulae could be fitted to such data, but would not give the best
determination of the harmonic components of the wind itself. In fact if we take the mean of
e^{isx} over the range (r − ½)λ to (r + ½)λ, it is

$$\frac{1}{\lambda}\int_{(r-\frac12)\lambda}^{(r+\frac12)\lambda} e^{isx}\,dx = \frac{2\sin\frac12 s\lambda}{s\lambda}\,e^{isr\lambda},$$

and harmonic analysis applied to the mean values will underestimate the coefficients by
a factor (2/sλ) sin ½sλ. The coefficients found from mean values must therefore be divided
by this quantity to give the harmonic development of f(x).
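
A quick numerical illustration of the correction follows. The spacing (24 equal cells in one period), the harmonic s = 2 and the cell r = 5 are hypothetical values, and the interval symbol is written `lam` in the code.

```python
import math


def averaging_factor(s, lam):
    """The factor (2 / (s*lam)) * sin(s*lam / 2) by which averaging over a
    cell of width lam reduces the harmonic of order s."""
    return 2.0 * math.sin(0.5 * s * lam) / (s * lam)


def cell_mean(func, centre, lam, n=2000):
    """Mean of func over (centre - lam/2, centre + lam/2), midpoint rule."""
    h = lam / n
    start = centre - 0.5 * lam
    return sum(func(start + (k + 0.5) * h) for k in range(n)) / n


lam = 2.0 * math.pi / 24.0  # hypothetical: 24 hourly means in one period
s, r = 2, 5                 # hypothetical harmonic and cell number

observed = cell_mean(lambda x: math.cos(s * x), r * lam, lam)
predicted = averaging_factor(s, lam) * math.cos(s * r * lam)
corrected = observed / averaging_factor(s, lam)
```

Dividing the averaged value by the factor recovers cos(s r lam), the value of the harmonic at the centre of the cell.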


14·101. Empirical periodicities: the periodogram. The method of 14·01 fits a
set of harmonic functions exactly to a finite number of values of a function. If the solution 
is to be used outside the original range (e.g. for prediction) it will be periodic, all terms 
being periodic in the interval used for analysis. But the function may be a periodic one 
with a period that is not a submultiple of the interval used, and we may wish to determine 
its amplitude and phase and possibly even the period itself. In the theory of the tides, for 
instance, we know from general theory that there must be harmonic components with 
periods calculable from the rates of the earth’s rotation and revolution and the moon’s 
orbital period, but the amplitudes and phases cannot be calculated on account of the 
complicated form of the ocean. What can be done, however, is to instal a tide gauge in 
each harbour where predictions are required. This records the tide height at regular in¬ 
tervals, often hourly. The amplitude and phase associated with each period can then be 
determined so as to fit the observations, and once these are known they provide a basis 
of prediction. A year’s observations suffice to make predictions as far ahead as we like 
(unless of course the harbour silts up). This semi-empirical method, the periods being taken 
from astronomy and the amplitudes and phases from direct observation, was introduced 
by Sir G. H. Darwin and is systematically carried out at the Liverpool Tidal Institute 
under Prof. J. Proudman and Dr A. T. Doodson. 

If only one period was concerned the calculation would be simple; the coefficients in
an expression of the form a + A cos γt + B sin γt could be found from only three
observations, though more would be combined by the method of least squares for greater accuracy.
Actually the tide contains 7 lunar and 7 solar components of incommensurable periods,
and a method of successive approximation has to be used. The interval is chosen so as to
be as nearly as possible a multiple of the two periods with the largest amplitudes, and the
coefficients are estimated from the formulae

$$A\sum\cos^2\gamma t = \sum f(t)\cos\gamma t, \qquad B\sum\sin^2\gamma t = \sum f(t)\sin\gamma t,$$

summation being over the times of observation. With this choice of interval the terms in
f(t) arising from one pair of components will produce a negligible effect on the estimates of
those from the other pair. The contributions from the largest terms are then subtracted
from the observed values, and then the residuals are analysed for further terms. Since 
these will not in general repeat themselves exactly in the original interval it may be 
necessary, after determining all the terms that should be possible theoretically, to return 
to the start and determine corrections to the largest terms from the residuals. 
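
The estimation formulae can be sketched on synthetic data. The record below is hypothetical: 240 hourly values containing a constant and two components with periods of 12 and 24 hours, chosen so that the record length is an exact multiple of both periods. In that case each component is recovered without interference from the other.

```python
import math


def fit_component(times, values, gamma):
    """Estimate A and B for a term A cos(gamma t) + B sin(gamma t) from
    A * sum(cos^2) = sum(f cos) and B * sum(sin^2) = sum(f sin)."""
    cos_sq = sum(math.cos(gamma * t) ** 2 for t in times)
    sin_sq = sum(math.sin(gamma * t) ** 2 for t in times)
    a_coef = sum(v * math.cos(gamma * t) for t, v in zip(times, values)) / cos_sq
    b_coef = sum(v * math.sin(gamma * t) for t, v in zip(times, values)) / sin_sq
    return a_coef, b_coef


# Hypothetical record: 240 hourly readings, periods of 12 h and 24 h.
gamma_1 = 2.0 * math.pi / 12.0
gamma_2 = 2.0 * math.pi / 24.0
times = list(range(240))
values = [0.3
          + 1.5 * math.cos(gamma_1 * t) + 0.7 * math.sin(gamma_1 * t)
          + 0.4 * math.cos(gamma_2 * t) - 0.2 * math.sin(gamma_2 * t)
          for t in times]

first = fit_component(times, values, gamma_1)
second = fit_component(times, values, gamma_2)
```

With incommensurable periods the cross terms no longer vanish exactly, which is why the text resorts to successive approximation on the residuals.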
When a period is suspected but not already known, the estimation is much more
difficult. Suppose that in 14·01

$$f(x) = C\exp i(\gamma x+\alpha),$$

where C and α are real and γ is not an integer. If we work out C_m from 14·01 (5) we get

$$nC_m = Ce^{i\alpha}\sum_{r=0}^{n-1}\exp\left\{\frac{2\pi i r}{n}(\gamma-m)\right\} = Ce^{i\alpha}\,\frac{1-\exp\{2\pi i(\gamma-m)\}}{1-\exp\{2\pi i(\gamma-m)/n\}}.$$

Evidently |C_m| is largest when m is as near as possible to γ. If m is taken to be the integral
part of γ, and therefore m + 1 to be the smallest integer greater than γ, the corresponding
terms in the harmonic analysis will be revealed by having the largest coefficients, and
their phases at x = 0 will be nearly opposite. Taking the real part and now taking f(x) to
be C cos(γx + α) we shall have for the largest terms

The ratio of the amplitudes then gives an equation for γ, and C is determined. The phase
of either term then determines α.
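
The behaviour of C_m for non-integral γ can be checked directly. The sketch assumes that the analysis of 14·01 samples f at x_r = 2πr/n and forms C_m = (1/n) Σ f(x_r) exp(−i m x_r); the values n = 64, γ = 7.3, α = 0.4 and C = 2 are hypothetical.

```python
import cmath
import math


def harmonic_coefficient(samples, m):
    """C_m = (1/n) * sum_r f(x_r) exp(-i m x_r), with x_r = 2 pi r / n."""
    n = len(samples)
    return sum(y * cmath.exp(-2j * math.pi * r * m / n)
               for r, y in enumerate(samples)) / n


def closed_form(c, alpha, gamma, m, n):
    """The geometric-series sum for C_m quoted in the text."""
    num = 1.0 - cmath.exp(2j * math.pi * (gamma - m))
    den = 1.0 - cmath.exp(2j * math.pi * (gamma - m) / n)
    return c * cmath.exp(1j * alpha) * num / den / n


n, gamma, alpha, c = 64, 7.3, 0.4, 2.0  # hypothetical values
samples = [c * cmath.exp(1j * (gamma * 2.0 * math.pi * r / n + alpha))
           for r in range(n)]
coeffs = [harmonic_coefficient(samples, m) for m in range(n)]
largest_two = sorted(range(n), key=lambda m: abs(coeffs[m]), reverse=True)[:2]
```

The two largest coefficients fall at m = 7 and m = 8, on either side of γ, and their ratio has a negative real part, so that their phases are nearly opposite.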

This kind of analysis is most used for the detection of natural periodicities, of which
perhaps the best-known is the sunspot period. In practice it is complicated by irregular
disturbances, so that the actual variation is not simple harmonic though the greater part
of it may be represented by a few harmonic terms. If there are n observed values y_r with
mean A_0, let

$$\sum (y_r - A_0)^2 = (n-1)s^2.$$

But any harmonic coefficient A_m or B_m (m ≥ 1) would contribute ½A_m² or ½B_m² to the mean
square, and there are n − 1 of them. (It is easy to verify that all product terms cancel.)
Thus the average of ½A_m² or ½B_m² is s²/(n − 1). This would be true even if the y_r were a wholly
random set. Consequently a set of harmonic coefficients by itself gives no evidence that
the periods can be used for prediction unless some of the amplitudes have squares much
greater than 4s²/(n − 1). In the method just described it is best to rely on the largest
coefficients because these are less affected proportionally by any irregular variation that
may be present.

The best way of estimating the uncertainty introduced by the irregular variation
depends on the circumstances.* One that often succeeds is to divide the range up into
equal stretches and do a separate analysis for each. If each stretch contains n observations
the phase of a harmonic with period n/γ will increase by 2πγ from one stretch to the next,
and if we take the terms in cos mx and sin mx in the analysis together the phase will
increase by 2π(γ − m). Hence the determinations of the phase for several stretches give
a set of linear equations for γ and α, which can be solved by the method of least squares,
and the residuals lead to an estimate of uncertainty. Without some such precaution
periodicities found by harmonic analysis and not predicted by previous theoretical
considerations should be mistrusted, as many complications are capable of giving spurious
periods; not more than a tenth of those that have been asserted will bear a proper statistical
examination.

* Cf. Jeffreys, Theory of Probability, pp. 291-5; M.N.R.A.S. 100, 1940, 139-55; Gerlands Beitr. z.
Geophysik, 53, 1938, 111-39.

14·11. Fourier's integral theorem: preliminary discussion. In Fourier's
theorem in the form

$$f(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(u)\,du + \sum_{n=1}^{\infty}\frac{1}{\pi}\int_{-\pi}^{\pi}f(u)\cos n(u-x)\,du \tag{1}$$

(subject to certain restrictions already stated) put

$$x = X/T, \quad u = U/T, \quad f(x) = F(X). \tag{2}$$

Then

$$F(X) = \frac{1}{2\pi T}\int_{-\pi T}^{\pi T}F(U)\,dU + \sum_{n=1}^{\infty}\frac{1}{\pi T}\int_{-\pi T}^{\pi T}F(U)\cos\frac{n}{T}(U-X)\,dU. \tag{3}$$

This extends the theorem to an arbitrary interval. Now take T very large, and put

$$n/T = \kappa. \tag{4}$$

Then the values of κ for consecutive terms differ by 1/T, and the first term will tend to
zero if ∫_{−∞}^{∞} F(U) dU converges. Hence we may expect that

$$F(X) = \frac{1}{\pi}\int_0^{\infty}d\kappa\int_{-\infty}^{\infty}F(U)\cos\kappa(U-X)\,dU \tag{5}$$

for values of X where F(X) is continuous, and that at simple discontinuities of F(X) the
repeated integral will be equal to ½{F(X+) + F(X−)}. This is Fourier's integral theorem.
But we may also expect that the occurrence of a repeated infinite integral will introduce 
new problems of convergence. The seriousness of these may be seen from the example 

$$F(X) = \cos\alpha X. \tag{6}$$

Here the integral with regard to U is infinite for κ = α and indeterminate for all other
values of κ, and the repeated integral is meaningless. This breakdown in the simplest
possible case may serve as a warning against the common belief among physicists that
Fourier's integral theorem is easy.

Even if

$$F(X) = \frac{1}{X}\sin\alpha X, \tag{7}$$

although ∫_{−∞}^{∞} F(X) dX exists, the integral with regard to U is infinite for κ = α.
Convergence of ∫_{−∞}^{∞} F(X) dX is therefore not a sufficient condition for the truth of the
theorem.

If, however, ∫_{−∞}^{∞} |F(X)| dX exists, the integral with regard to U will converge uniformly
for all κ; and as for the series theorem, if the repeated integral is to have a definite value
for all X, we shall expect to need a further condition, such as that F(X) has bounded
variation, and therefore that its discontinuities, if any, are simple.
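
For a function that does satisfy these conditions the repeated integral can be checked numerically. The sketch takes the illustrative case F(X) = exp(−|X|), which is not an example from the text. Its inner integral with regard to U has the closed form 2 cos κX/(1 + κ²), and the outer integral is truncated at a large but finite κ. The truncation point and step are arbitrary choices.

```python
import math


def inner_integral(kappa, x):
    """Integral over all U of exp(-|U|) * cos(kappa * (U - x)), in closed form."""
    return 2.0 * math.cos(kappa * x) / (1.0 + kappa * kappa)


def fourier_integral(x, kappa_max=1000.0, step=0.005):
    """(1/pi) * integral_0^kappa_max of the inner integral (midpoint rule)."""
    n = int(round(kappa_max / step))
    total = sum(inner_integral((k + 0.5) * step, x) for k in range(n))
    return total * step / math.pi


if __name__ == "__main__":
    for x in (0.0, 1.0, -2.5):
        print(x, fourier_integral(x), math.exp(-abs(x)))
```

The agreement is limited mainly by the truncation of the outer integral, which is slowest to converge at X = 0.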

14·111. A related potential problem. We have seen (6·093) that if a potential
function φ is given over a plane and x is the normal distance of a point P from the plane,
all charges being either on the plane or on the side of it remote from P, and if φ satisfies
some suitable condition at large distances,

$$2\pi\phi_P = \iint \frac{\phi}{r^3}\,x\,dS. \tag{1}$$

If Q is (0, η, ζ), and P is (x, y, z) and if φ is a function of η only, say f(η),

$$\phi_P = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{x f(\eta)}{x^2+(\eta-y)^2}\,d\eta. \tag{2}$$

Also

$$\int_0^{\infty}e^{-\kappa x}\cos\kappa(\eta-y)\,d\kappa = \frac{x}{x^2+(\eta-y)^2} \tag{3}$$

and therefore (2) is equivalent to

$$\phi_P = \frac{1}{\pi}\int_{-\infty}^{\infty}f(\eta)\,d\eta\int_0^{\infty}e^{-\kappa x}\cos\kappa(\eta-y)\,d\kappa. \tag{4}$$

If we reverse the order of integration and then put x = 0 we get Fourier's integral theorem
again. But we clearly cannot put x = 0 before reversing the order of integration. On the
other hand if x > 0, 14·111 (2) exists in much wider conditions than 14·11 (5); it exists
if f(η) is integrable over any finite interval and if Y exists such that ∫_Y^{∞} f(η)/η² dη and
∫_{−∞}^{−Y} f(η)/η² dη converge. In fact f(η) might be η cos αη or η sin αη, so far as we can see at
present. If then (2) tends to a limit f(y) when x tends to 0, we shall have a much more general
theorem. This approach is of further interest because

$$\frac{x}{x^2+(\eta-y)^2} = \Re\,\frac{1}{x+iy-i\eta} \tag{5}$$

and therefore there will be, in suitable conditions, a function ψ_P allied to φ_P so that
φ_P + iψ_P is an analytic function of the complex variable x + iy, and if φ_P is a potential
ψ_P is the corresponding charge function. We therefore consider also the function

$$\psi_P = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{(\eta-y)f(\eta)}{x^2+(\eta-y)^2}\,d\eta = \frac{1}{\pi}\int_{-\infty}^{\infty}f(\eta)\,d\eta\int_0^{\infty}e^{-\kappa x}\sin\kappa(\eta-y)\,d\kappa. \tag{6}$$

Clearly φ_P may exist in cases where ψ_P does not. The analogue of 14·11 (5) will be the
repeated integral

$$\frac{1}{\pi}\int_0^{\infty}d\kappa\int_{-\infty}^{\infty}f(\eta)\sin\kappa(\eta-y)\,d\eta. \tag{7}$$

The allied function was introduced by Titchmarsh;* besides its applications in
two-dimensional potential theory, it is needed in the discussion of the linear response of a
recording instrument (14·15).

* Proc. Lond. Math. Soc. (2) 1926, 109-30. But cf. Lamb, Phil. Trans. A, 203, 1904, 26.
14·112. Proof of Fourier's integral theorem. If f(y) is of bounded variation, and
|f(y)| is integrable from −∞ to ∞, then

$$\frac{1}{\pi}\int_0^{\infty}d\kappa\int_{-\infty}^{\infty}f(\eta)\cos\kappa(\eta-y)\,d\eta = \tfrac12\{f(y+)+f(y-)\}.$$

In the conditions stated ∫_{−∞}^{∞} f(η) cos κ(η − y) dη is uniformly convergent by the M test,
and therefore for any finite A

$$I_A = \frac{1}{\pi}\int_0^{A}d\kappa\int_{-\infty}^{\infty}f(\eta)\cos\kappa(\eta-y)\,d\eta = \frac{1}{\pi}\int_{-\infty}^{\infty}f(\eta)\,d\eta\int_0^{A}\cos\kappa(\eta-y)\,d\kappa = \frac{1}{\pi}\int_{-\infty}^{\infty}f(\eta)\,\frac{\sin A(\eta-y)}{\eta-y}\,d\eta. \tag{1}$$

The rest of the argument is substantially the same as for the summation of a Fourier
series. We write

$$I_A = \frac{1}{\pi}\int_0^{\delta}\{f(y+\theta)+f(y-\theta)\}\frac{\sin A\theta}{\theta}\,d\theta + \frac{1}{\pi}\int_{\delta}^{\infty}\{f(y+\theta)+f(y-\theta)\}\frac{\sin A\theta}{\theta}\,d\theta, \tag{2}$$

where δ is independent of A. Then since {f(y+θ) + f(y−θ)}/θ is of bounded variation in
δ ≤ θ < ∞ the second integral tends to zero as A → ∞, by Riemann's lemma; and the first tends to
½{f(y+) + f(y−)}, as in the proof for the series theorem. Hence the theorem follows.

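The corresponding theorem for the allied integral is as follows. If f(y) satisfies the same
conditions as in Fourier's integral theorem, then at any y where a Lipschitz condition is
satisfied

$$\frac{1}{\pi}\int_0^{\infty}d\kappa\int_{-\infty}^{\infty}f(\eta)\sin\kappa(\eta-y)\,d\eta = \frac{1}{\pi}\lim_{\delta\to0}\int_{\delta}^{\infty}\{f(y+\theta)-f(y-\theta)\}\frac{d\theta}{\theta}.$$

We have similarly

$$J_A = \frac{1}{\pi}\int_0^{A}d\kappa\int_{-\infty}^{\infty}f(\eta)\sin\kappa(\eta-y)\,d\eta = \frac{1}{\pi}\int_{-\infty}^{\infty}f(\eta)\,d\eta\int_0^{A}\sin\kappa(\eta-y)\,d\kappa = \frac{1}{\pi}\int_{-\infty}^{\infty}f(\eta)\,\frac{1-\cos A(\eta-y)}{\eta-y}\,d\eta. \tag{3}$$

By the Lipschitz condition, for some M and α (possibly depending on y), where 0 < α ≤ 1,

$$|f(\eta)-f(y)| < M\,|\eta-y|^{\alpha}. \tag{4}$$

Then

$$J_A = \frac{1}{\pi}\int_0^{\delta}\{f(y+\theta)-f(y-\theta)\}\frac{1-\cos A\theta}{\theta}\,d\theta + \frac{1}{\pi}\int_{\delta}^{\infty}\{f(y+\theta)-f(y-\theta)\}\frac{1-\cos A\theta}{\theta}\,d\theta. \tag{5}$$

The first integral has modulus less than 4Mδ^α/α; choose δ so that this is less than ω. This
choice of δ is independent of A. Then as A → ∞,

$$\frac{1}{\pi}\int_{\delta}^{\infty}\{f(y+\theta)-f(y-\theta)\}\frac{\cos A\theta}{\theta}\,d\theta \to 0 \tag{6}$$

and therefore by choosing A suitably we can make

$$\left|J_A - \frac{1}{\pi}\int_{\delta}^{\infty}\{f(y+\theta)-f(y-\theta)\}\frac{d\theta}{\theta}\right| < 2\omega. \tag{7}$$

Hence

$$\frac{1}{\pi}\int_0^{\infty}d\kappa\int_{-\infty}^{\infty}f(\eta)\sin\kappa(\eta-y)\,d\eta = \frac{1}{\pi}\lim_{\delta\to0}\int_{\delta}^{\infty}\{f(y+\theta)-f(y-\theta)\}\frac{d\theta}{\theta}. \tag{8}$$

In particular this is true if f(η) is differentiable at η = y.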


14·113. Extension of Fourier's integral theorem. If f(y) is bounded for all y and
integrable in any finite range of y, the integral

$$\phi(x,y) = \frac{1}{\pi}\int_{-\infty}^{\infty}d\eta\int_0^{\infty}f(\eta)\,e^{-\kappa x}\cos\kappa(\eta-y)\,d\kappa$$

is a solution of Laplace's equation for x > 0; and when x tends to 0 the integral tends to
½{f(y+) + f(y−)} for any value of y such that the latter expression exists.

The integral is equal to

$$\phi(x,y) = \frac{1}{\pi}\int_{-\infty}^{\infty}f(\eta)\,\frac{x\,d\eta}{x^2+(\eta-y)^2}. \tag{1}$$

This is uniformly convergent, by the M test, in any interval of x and y such that x ≥ c > 0.
The integrals obtained by differentiating once and twice under the integral sign are also
uniformly convergent, and therefore differentiation under the integral sign is permissible.
Carrying out the differentiation we show that

$$\frac{\partial^2\phi(x,y)}{\partial x^2} + \frac{\partial^2\phi(x,y)}{\partial y^2} = 0 \qquad (x \ge c > 0). \tag{2}$$

Next, put

$$\eta = y + x\tan\theta. \tag{3}$$

Then

$$\phi(x,y) = \frac{1}{\pi}\int_{-\frac12\pi}^{\frac12\pi}f(y+x\tan\theta)\,d\theta. \tag{4}$$

It follows immediately that if f(y) has upper and lower bounds M and m, then for all
positive x, m ≤ φ(x, y) ≤ M.

Suppose that f(η) is continuous on both sides of η = y, but not necessarily at y. We
can choose a range δ of η such that for 0 < x tan θ ≤ δ

$$|f(y+x\tan\theta)-f(y+)| < \omega, \qquad |f(y-x\tan\theta)-f(y-)| < \omega, \tag{5}$$

where ω is arbitrarily small; and

$$\phi(x,y) = \frac{1}{\pi}\int_0^{\tan^{-1}\delta/x}\{f(y+x\tan\theta)+f(y-x\tan\theta)\}\,d\theta + \frac{1}{\pi}\int_{\tan^{-1}\delta/x}^{\frac12\pi}\{f(y+x\tan\theta)+f(y-x\tan\theta)\}\,d\theta. \tag{6}$$

The first term is

$$\frac{1}{\pi}\left[\{f(y+)+f(y-)\} + 2\lambda\omega\right]\tan^{-1}\frac{\delta}{x}, \tag{7}$$

where |λ| < 1; and when x → 0 this tends to

$$\tfrac12\{f(y+)+f(y-)\} \tag{8}$$

with an error less than ω, which is arbitrarily small.


The second term in (6) tends to 0 if f(y) is bounded, since the lower limit tends to ½π.
Hence

$$\lim_{x\to0}\phi(x,y) = \tfrac12\{f(y+)+f(y-)\}. \tag{9}$$

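The corresponding theorem for the allied function is: if f(y) is bounded for all y, and
integrable over any finite range of y, and if, for some positive Y, ∫_Y^{∞} f(η)/η dη and
∫_{−∞}^{−Y} f(η)/η dη converge, then for x > 0 the integral

$$\psi(x,y) = \frac{1}{\pi}\int_{-\infty}^{\infty}d\eta\int_0^{\infty}f(\eta)\,e^{-\kappa x}\sin\kappa(\eta-y)\,d\kappa$$

is the imaginary part of an analytic function of x + iy, of which φ(x, y) is the real part; and
when x → 0

$$\psi(x,y) \to \frac{1}{\pi}\,P\!\int_{-\infty}^{\infty}\frac{f(\eta)}{\eta-y}\,d\eta$$

at any value of y where f(η) satisfies a Lipschitz condition.
We have for x > 0

$$\psi(x,y) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{(\eta-y)f(\eta)}{x^2+(\eta-y)^2}\,d\eta, \tag{10}$$

which is convergent in the conditions stated; and

$$\phi+i\psi = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{f(\eta)\,d\eta}{x+iy-i\eta}, \tag{11}$$

which is an analytic function of x + iy.
Choose a small quantity δ such that

$$\int_0^{\delta}|f(y+\theta)-f(y-\theta)|\,\frac{d\theta}{\theta} < \omega; \tag{12}$$

this can be done because f(y) satisfies a Lipschitz condition. Then

$$\left|\int_0^{\delta}\{f(y+\theta)-f(y-\theta)\}\frac{\theta\,d\theta}{x^2+\theta^2}\right| < \omega \tag{13}$$

and

$$\int_{\delta}^{\infty}\{f(y+\theta)-f(y-\theta)\}\frac{\theta\,d\theta}{x^2+\theta^2} \to \int_{\delta}^{\infty}\{f(y+\theta)-f(y-\theta)\}\frac{d\theta}{\theta}, \tag{14}$$

since the conditions of Abel's test for uniform convergence are satisfied for all x < δ.
The theorem follows immediately.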

The conditions stated for these two theorems can be appreciably relaxed. If

$$f(\eta) = \eta\cos\alpha\eta \quad\text{or}\quad \eta\sin\alpha\eta$$

it is easy to show by contour integration that the results for φ(x, y) remain true. But the
most important extensions are probably to the cases where f(η) tends to finite limits as
η → ±∞. If f(η) = A, we get φ(x, y) = A, and the allied function is a constant by the
Cauchy-Riemann relations. If f(η) = A (η > 0), f(η) = −A (η < 0),

$$\phi(x,y) = \frac{2A}{\pi}\tan^{-1}\frac{y}{x}$$

and the allied function is

$$-\frac{2A}{\pi}\log\,(x^2+y^2)^{1/2}.$$

Hence if f(η) → C (η → ∞), f(η) → D (η → −∞), and

$$\int_Y^{\infty}\frac{f(\eta)-C}{\eta}\,d\eta, \qquad \int_{-\infty}^{-Y}\frac{f(\eta)-D}{\eta}\,d\eta$$

converge for some positive Y, we can still find an allied function by applying the method to

$$g(\eta) = f(\eta)-C \quad (\eta>0), \qquad g(\eta) = f(\eta)-D \quad (\eta<0)$$

and adding ½(C + D) + {(C − D)/π} tan⁻¹(y/x) to φ, −{(C − D)/π} log (x² + y²)^{1/2} to ψ.
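
The step-function case can be checked from the form 14·113 (4), φ(x, y) = (1/π) ∫ f(y + x tan θ) dθ taken over (−½π, ½π). In this sketch the value A = 1.5, the sample points and the quadrature density are arbitrary choices.

```python
import math


def phi(f, x, y, n=20000):
    """(1/pi) * integral of f(y + x tan(theta)) over (-pi/2, pi/2), by the
    midpoint rule; valid for x > 0."""
    h = math.pi / n
    total = sum(f(y + x * math.tan(-0.5 * math.pi + (k + 0.5) * h))
                for k in range(n))
    return total * h / math.pi


A = 1.5  # hypothetical height of the step


def step(eta):
    return A if eta > 0 else -A


def phi_exact(x, y):
    """The closed form (2A/pi) * arctan(y/x) quoted in the text."""
    return 2.0 * A / math.pi * math.atan(y / x)


if __name__ == "__main__":
    for x, y in ((1.0, 0.5), (0.2, -1.0), (0.01, 0.3)):
        print(x, y, phi(step, x, y), phi_exact(x, y))
```

As x decreases with y > 0 fixed the value approaches A, in agreement with the limit ½{f(y+) + f(y−)}.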

14·12. The cosine and sine transforms. If f(y) is given from y = 0 to ∞ and if
f(−y) = f(y), we have, subject to convergence conditions as in 14·112,

$$f(y) = \frac{2}{\pi}\int_0^{\infty}d\kappa\int_0^{\infty}f(\eta)\cos\kappa\eta\,\cos\kappa y\,d\eta. \tag{1}$$

Similarly if f(−y) = −f(y),

$$f(y) = \frac{2}{\pi}\int_0^{\infty}d\kappa\int_0^{\infty}f(\eta)\sin\kappa\eta\,\sin\kappa y\,d\eta. \tag{2}$$

These rules can be stated as follows. If

$$g(\kappa) = \sqrt{\frac{2}{\pi}}\int_0^{\infty}f(\eta)\cos\kappa\eta\,d\eta, \quad\text{then}\quad f(y) = \sqrt{\frac{2}{\pi}}\int_0^{\infty}g(\kappa)\cos\kappa y\,d\kappa. \tag{3}$$

If

$$h(\kappa) = \sqrt{\frac{2}{\pi}}\int_0^{\infty}f(\eta)\sin\kappa\eta\,d\eta, \quad\text{then}\quad f(y) = \sqrt{\frac{2}{\pi}}\int_0^{\infty}h(\kappa)\sin\kappa y\,d\kappa. \tag{4}$$

These results show a complete symmetry between f(y) and its two transforms g(κ) and
h(κ). The latter are known as the Fourier cosine and sine transforms; they constitute two
of the earliest solutions of integral equations, and were given as such by Laplace. Fourier's
integral theorem in its general form can similarly be written: if

$$v(\kappa,y) = \int_{-\infty}^{\infty}f(\eta)\cos\kappa(\eta-y)\,d\eta, \quad\text{then}\quad f(y) = \frac{1}{\pi}\int_0^{\infty}v(\kappa,y)\,d\kappa,$$

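or, more symmetrically, if

$$w(\kappa) = \frac{1}{\sqrt{(2\pi)}}\int_{-\infty}^{\infty}f(\eta)\,e^{-i\kappa\eta}\,d\eta, \tag{5}$$

then

$$f(y) = \frac{1}{\sqrt{(2\pi)}}\int_{-\infty}^{\infty}w(\kappa)\,e^{i\kappa y}\,d\kappa.$$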

14·121. Parseval's theorem for integrals. By (3),

$$\int_0^{\infty}f_1(y)f_2(y)\,dy = \sqrt{\frac{2}{\pi}}\int_0^{\infty}f_1(y)\,dy\int_0^{\infty}g_2(\kappa)\cos\kappa y\,d\kappa = \int_0^{\infty}g_1(\kappa)g_2(\kappa)\,d\kappa, \tag{6}$$

on replacing η by y and reversing the order of integration. Similarly

$$\int_0^{\infty}f_1(y)f_2(y)\,dy = \int_0^{\infty}h_1(\kappa)h_2(\kappa)\,d\kappa. \tag{7}$$

If f_1(y) = f_2(y) = f(y) we have

$$\int_0^{\infty}f^2(y)\,dy = \int_0^{\infty}g^2(\kappa)\,d\kappa = \int_0^{\infty}h^2(\kappa)\,d\kappa. \tag{8}$$

This is the analogue for integrals of Parseval’s theorem for series. For a vibrating 
dynamical system both theorems have the physical interpretation that the total energy 
of the system is the sum of the energies in the normal modes. 
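
Relation (8) can be verified for the illustrative case f(y) = exp(−y), which is not an example from the text. Its transforms under definitions (3) and (4) are g(κ) = √(2/π)/(1 + κ²) and h(κ) = √(2/π) κ/(1 + κ²), and all three integrals in (8) should equal ½. The substitution κ = tan θ used for the quadrature is an implementation choice.

```python
import math


def integrate_half_line(func, n=4000):
    """Integral of func over (0, infinity), using the substitution
    u = tan(theta) and the midpoint rule in theta."""
    h = 0.5 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += func(math.tan(theta)) / math.cos(theta) ** 2
    return total * h


def f(y):
    return math.exp(-y)


def g(kappa):
    # Cosine transform of exp(-y) under definition (3).
    return math.sqrt(2.0 / math.pi) / (1.0 + kappa * kappa)


def h_sine(kappa):
    # Sine transform of exp(-y) under definition (4).
    return math.sqrt(2.0 / math.pi) * kappa / (1.0 + kappa * kappa)


energy_f = integrate_half_line(lambda y: f(y) ** 2)
energy_g = integrate_half_line(lambda k: g(k) ** 2)
energy_h = integrate_half_line(lambda k: h_sine(k) ** 2)
```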


14·13. The Fourier-Mellin theorem. If H(t) is the Heaviside unit function the sum

$$f(\tau_1)\{H(t-\tau_1)-H(t-\tau_2)\} + f(\tau_2)\{H(t-\tau_2)-H(t-\tau_3)\} + \dots + f(\tau_r)\{H(t-\tau_r)-H(t-\tau_{r+1})\}, \tag{1}$$

where τ_1 < τ_2 < ... < τ_{r+1}, is equal to f(τ_1) for τ_1 < t < τ_2, f(τ_2) for τ_2 < t < τ_3 and so on.
Proceeding to the limit when the values become indefinitely close, if f(t) is continuous, we
get the Stieltjes integral

$$f(t) = -\int_{\tau=-\infty}^{\infty}f(\tau)\,dH(t-\tau). \tag{2}$$

We can suppose that f(t) = 0 for t < 0. Now substitute for H(t − τ) in terms of Bromwich's
integral.

$$f(t) = -\int_0^{\infty}f(\tau)\,d_{\tau}\left\{\frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}e^{z(t-\tau)}\frac{dz}{z}\right\} = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty}e^{zt}\,dz\int_0^{\infty}f(\tau)\,e^{-z\tau}\,d\tau, \tag{3}$$

assuming that we may invert the order of integration and then differentiate under the
integral sign. Then if for some c > 0, and ℜ(z) = c,

$$z\int_0^{\infty}f(\tau)\,e^{-z\tau}\,d\tau = F(z), \tag{4}$$

$$F(p)H(t) = f(t)H(t). \tag{5}$$

This gives a rule for deriving an operator for any function f(t) such that the integral (4)
exists for ℜ(z) ≥ c. (4) is called the Laplace transform of f(t); this name is often given
to F(z)/z.
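
Definition (4) can be tried on simple functions. The sketch below evaluates F(z) = z ∫ f(τ) exp(−zτ) dτ numerically at one complex z and compares with closed forms that follow from elementary integration: F(z) = 1 for f = 1, F(z) = 1/z for f(t) = t, and F(z) = z/(z + 1) for f(t) = exp(−t). The value of z and the quadrature (substitution τ = tan θ, midpoint rule) are arbitrary choices.

```python
import cmath
import math


def operator_image(f, z, n=20000):
    """F(z) = z * integral_0^inf f(tau) exp(-z tau) d tau, by the substitution
    tau = tan(theta) and the midpoint rule; needs Re(z) > 0 and f of at most
    slow growth."""
    h = 0.5 * math.pi / n
    total = 0j
    for k in range(n):
        theta = (k + 0.5) * h
        tau = math.tan(theta)
        total += f(tau) * cmath.exp(-z * tau) / math.cos(theta) ** 2
    return z * total * h


z = 1.5 + 2.0j  # hypothetical point with positive real part

image_of_one = operator_image(lambda t: 1.0, z)
image_of_t = operator_image(lambda t: t, z)
image_of_exp = operator_image(lambda t: math.exp(-t), z)
```

The factor z in (4) is what makes the image of the unit function equal to 1, in keeping with the remark that the name Laplace transform is often given to F(z)/z instead.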

To justify this, put z = c + iy. Then the second integral in (3) is

$$\frac{e^{ct}}{2\pi}\int_{-\infty}^{\infty}dy\int_0^{\infty}f(\tau)\,e^{-c\tau}e^{iy(t-\tau)}\,d\tau = \frac{e^{ct}}{\pi}\int_0^{\infty}dy\int_0^{\infty}f(\tau)\,e^{-c\tau}\cos y(t-\tau)\,d\tau \tag{6}$$

and (3) is equivalent to

$$f(t)\,e^{-ct} = \frac{1}{\pi}\int_0^{\infty}dy\int_0^{\infty}f(\tau)\,e^{-c\tau}\cos y(t-\tau)\,d\tau. \tag{7}$$

But this is Fourier's integral theorem applied to the function that is 0 for t < 0 and equal
to f(t)e^{−ct} for t > 0. If this function satisfies the conditions for the theorem for some
positive value of c, the operator corresponding to f(t) can therefore be found by (4). The
result (7) is true for all c large enough for the integral (4) to converge absolutely,
that is, for ∫_0^{∞} |f(τ)| e^{−cτ} dτ to converge. F(z) found by this method will be analytic
for ℜ(z) > c, but will not in general satisfy the rule of our fundamental operators that it
shall have an expansion in negative powers of z for all z with sufficiently large modulus.
Consequently it has been held by some recent writers that the operational method should
be abandoned, or indeed that it has been already abandoned, and replaced by the
Fourier-Mellin theorem.* This, however, seems to be a fundamental mistake in method. The
operational method considers the state of a system given at time 0, and obtains a solution
for its state at time t depending at no point on any information except about its state at
time 0 and the disturbances acting between time 0 and t. The equation (4) is meaningless
unless we know the function at all times up to infinity. It is true that the values at times
later than t will often not affect the state at time t. But if f(t) = exp (at²) with a positive
the integral diverges whatever c may be; and we are never in a position in an actual
experiment to verify that this does not happen for t large enough. Even without such
disturbances, it is a common occurrence for second-order terms neglected in the equations
to rise into importance if the time is long enough, and it is not justifiable to adopt a
procedure that assumes the equations to be linear for all time; the whole theory of
equipartition of energy rests on this fact. The operational method avoids all such complications
by making the procedure depend directly on the data up to time t and on nothing else.
The Fourier-Mellin procedure makes the validity of the solution depend on the superfluous
extra condition that future disturbances are not of such a character as to make the
integral in (4) diverge for every c. It is a valuable method in its proper place, as has
particularly been shown by van der Pol, since when we know that the solution of a differential
equation satisfies the conditions for the applicability of the method it can often be
obtained in a form immediately adapted to contour integration. The use of p as a notation
for a complex variable in this method is however to be deprecated. The method is not a
substitute for the operational method, since each is valid in conditions where the other
is not, and nothing but confusion can arise from mixing the notations. The usual symbol
z for a complex variable is available, and so is Bromwich's λ, and there is no occasion to
use the preoccupied p.

The practice of denoting what we have called F(z)/z by F(z) is to be especially
deprecated. This function differs from f(t) in dimensions, and gives rise to needless trouble in
checking. The point has been made explicitly by McLachlan.

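14·14. Harmonic oscillation of finite duration. Let

$$f(t) = \cos\gamma t \quad (-\tfrac12T < t < \tfrac12T), \qquad f(t) = 0 \quad (t < -\tfrac12T,\ t > \tfrac12T).$$

We wish to express f(t) as a Fourier integral. We find

$$\int_{-\infty}^{\infty}f(\tau)\cos\kappa(\tau-t)\,d\tau = \int_{-\frac12T}^{\frac12T}\cos\gamma\tau\,\cos\kappa(t-\tau)\,d\tau = \left(\frac{\sin\frac12(\gamma-\kappa)T}{\gamma-\kappa} + \frac{\sin\frac12(\gamma+\kappa)T}{\gamma+\kappa}\right)\cos\kappa t$$

and

$$f(t) = \frac{1}{\pi}\int_0^{\infty}\left(\frac{\sin\frac12(\gamma-\kappa)T}{\gamma-\kappa} + \frac{\sin\frac12(\gamma+\kappa)T}{\gamma+\kappa}\right)\cos\kappa t\,d\kappa.$$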

* From remarks of some enthusiasts one might infer that if asked to solve — = cos t, they proceed 

as follows: first form the Laplace transform of cos t; multiply by 1/z; substitute the result for F(z)/z
in the Bromwich integral, and evaluate by contour integration.

If f(t) is unbounded near t = 0, but has an improper Riemann integral, the operational method
needs no special justification. The proof of the Fourier-Mellin theorem, on the other hand, meets
with new difficulties when the function is unbounded. See also 14·13a.

Thus the Fourier representation of a harmonic oscillation of finite duration includes an
infinite range of frequencies. If we take $\gamma$ positive the second term will be small compared
with the first for values of $\kappa$ near $\gamma$; the first has a maximum equal to $\tfrac12 T$ for $\kappa = \gamma$, and
vanishes first for $\kappa - \gamma = \pm 2\pi/T$. There is a pair of minima at $\gamma \pm 3\pi/T$, about −0·2 of
the first maximum. Thus the larger amplitudes are concentrated about $\kappa = \gamma$ provided
that $\gamma T$ is large; but for any finite T there will be a finite range of $\kappa$ and $\gamma$ such that the
amplitude is not negligible, and this will be shown, in the optical case, by a broadening of
the spectral line.
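The statements about the first term of the amplitude can be checked numerically. The following is a minimal sketch, not part of the original text; the function name `first_term` and the values of γ and T are illustrative assumptions.

```python
import math

def first_term(kappa, gamma, T):
    """sin(½(γ-κ)T)/(γ-κ), with its limiting value ½T at κ = γ."""
    d = gamma - kappa
    return 0.5 * T if d == 0.0 else math.sin(0.5 * d * T) / d

# Illustrative values only (γT = 100, so the line is fairly sharp).
gamma, T = 50.0, 2.0
peak = first_term(gamma, gamma, T)                        # maximum, ½T
zero = first_term(gamma + 2 * math.pi / T, gamma, T)      # first zero
side = first_term(gamma + 3 * math.pi / T, gamma, T)      # near the first minimum
print(peak, zero, side / peak)
```

At κ − γ = 3π/T the ratio to the maximum is −2/(3π), in line with the figure of about −0·2 quoted above.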


14·15. Response of a recording instrument. Instruments have usually to be
tested to find out whether their actual behaviour is in accordance with that intended.
A common method of testing an instrument for recording vibrations, such as a seismograph,
is to apply known harmonic disturbances of different periods. For each period the lag
and the magnification are recorded when they have become steady. The instrument to
be useful must be stable and damped; its response may satisfy a differential equation of
the second order, but need not do so, but if solutions of the form $e^{\lambda t}$ are possible in the
absence of disturbance, all the values of $\lambda$ have negative real parts. If a disturbance is
$\cos\gamma t$, the response is $\mu\cos(\gamma t - \epsilon)$, where $\mu$ and $\epsilon$ will be functions of $\gamma$. This may be
written by saying that the response to $e^{i\gamma t} + e^{-i\gamma t}$ is $\mu(e^{i\gamma t - i\epsilon} + e^{-i\gamma t + i\epsilon})$, or that for all $\gamma$,
$\mu e^{-i\epsilon}$ is a function of $\gamma$, $\mu$ being an even and $\epsilon$ an odd function of $\gamma$. If we take $i\gamma = \lambda$,
$\mu e^{-i\epsilon}$ is an analytic function of $\lambda$ when $\lambda$ has a positive real part, and the two functions
$\mu\cos\epsilon$ and $-\mu\sin\epsilon$ are related in the same way as the electrostatic potential and charge
function. Hence by 14·112(8),

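$$-\mu\sin\epsilon = \frac{1}{\pi}\int_0^\infty\{(\mu\cos\epsilon)_{\gamma+\nu} - (\mu\cos\epsilon)_{\gamma-\nu}\}\frac{d\nu}{\nu},$$

and since $i\mu e^{-i\epsilon}$ is another analytic function

$$\mu\cos\epsilon = \frac{1}{\pi}\int_0^\infty\{(\mu\sin\epsilon)_{\gamma+\nu} - (\mu\sin\epsilon)_{\gamma-\nu}\}\frac{d\nu}{\nu}.$$

These are relations between observed quantities and can be used either as a check on the
hypothesis of linear response or to use the observed values of $\mu$ and $\epsilon$ to improve each
other's accuracy. As a rule $\mu \to 0$ with $\gamma$ (very long period). For $\gamma$ very large $\mu$ may tend to
a finite limit $\mu_0$ as for a damped pendulum, or to 0 as for a Galitzin seismograph. It can
be shown* that if the disturbance is H(t) the response is 0 for t < 0, and for t > 0 it is

$$\frac{1}{\pi}\int_0^\infty\mu\sin(\gamma t - \epsilon)\frac{d\gamma}{\gamma} = \mu_0H(t) + \frac{1}{\pi}\int_0^\infty\{(\mu\cos\epsilon - \mu_0)\sin\gamma t + \mu\sin\epsilon\,(1 - \cos\gamma t)\}\frac{d\gamma}{\gamma}.$$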


EXAMPLES 


1. Show that if $0 \le x \le \pi$,

$$\sin x = \frac{2}{\pi} - \frac{4}{\pi}\left(\frac{\cos 2x}{1.3} + \frac{\cos 4x}{3.5} + \frac{\cos 6x}{5.7} + \ldots\right).$$

What function does the series represent in the range $-\pi \le x < 0$? (Prelim. 1936.)

2. Show that the series $\sum a^n\cos b^n\pi x$, where |a| < 1, represents a continuous function of x, but
that its derivative, if any, has no Fourier expansion in 0 < x < 1 if b is an integer and |ab| > 1. What
can be inferred about the derivative from this result? (Actually this series, though continuous for
all real x, can be proved by other methods to have no derivative for any value of x in the conditions
stated. Cf. Hardy, Trans. Amer. Math. Soc. 17, 1916, 301-25.)

* Jeffreys, Phil. Mag. (7) 30, 1940, 165-7. A term $-\mu_0\sin\gamma t$ has been omitted from equation (29)
of the paper.


3. Find the cosine and sine expansions of $\cos\tfrac12x$ and $\sin\tfrac12x$ in the range $0 < x < \pi$, and verify that
they satisfy the rules for differentiating Fourier series.

4. Prove that for $0 < \theta < \pi$

$$\sin\theta + \tfrac12\sin2\theta + \tfrac13\sin3\theta + \ldots = \tfrac12\pi - \tfrac12\theta.$$

5. If f(n) is given for every positive or negative integral value of n, prove that (1) subject to suitable
convergence conditions the function

$$g(x) = \frac{1}{\pi}\sum_{n=-\infty}^{\infty}f(n)\frac{\sin\pi(x-n)}{x-n} = \frac{1}{\pi}\int_0^\pi\Big\{\sum_{n=-\infty}^{\infty}\cos p(x-n)\,f(n)\Big\}\,dp$$

is equal to f(n) for x = n,

(2) $$\int_{-\infty}^{\infty}g^2(x)\,dx = \sum_{n=-\infty}^{\infty}f^2(n).$$

(Whittaker, Proc. R.S. Edin. 35, 1915, 181-94.)

6. If

$$f(x) = A_0 + \sum A_n\cos nx + \sum B_n\sin nx \quad (0 < x < 2\pi),$$

$$f'(x) = a_0 + \sum a_n\cos nx + \sum b_n\sin nx \quad (0 < x < 2\pi),$$

$$\Delta = f(2\pi-) - f(0+),$$

show that

$$a_0 = \frac{\Delta}{2\pi},\qquad a_n = \frac{\Delta}{\pi} + nB_n,\qquad b_n = -nA_n.$$

7. If

$$f(x) = A_0 + \sum(A_n\cos nx + B_n\sin nx),$$

$$g(x) = C_0 + \sum(C_n\cos nx + D_n\sin nx),$$

prove that

$$\frac{1}{2\pi}\int_0^{2\pi}f(x)\,g(x)\,dx = A_0C_0 + \tfrac12\sum(A_nC_n + B_nD_n),$$

under conditions to be stated.

8. Prove that if g(x) is the function fitted to f(x) as in 14·01, then according as n is odd or even.


.. /, ./ 2m\\ 

e{x) -*,-/U : (*• 008 * rv 

sin £ j x -1 11 

\ n J 


2rn\ 

n ) 


and explain how this could have been inferred from the theory of the complex variable. 

9. Find a series of sines and cosines that will represent $e^x$ in the range $-\pi < x < \pi$, and draw the
graph of the sum of the series outside this range. Deduce that

$$\coth\pi = \frac{1}{\pi} + \frac{2}{\pi}\left(\frac{1}{1+1^2} + \frac{1}{1+2^2} + \frac{1}{1+3^2} + \ldots\right). \quad\text{(I.C. 1939.)}$$

10. Find a solution of the integral equation 



g(x) cos axdx =/(a). 


where 


/(a) = l — £a 2 (0<a<l) 

= 0 (a> 1). 


(I.C. 1943.) 






Chapter 15 

THE FACTORIAL AND RELATED FUNCTIONS 


‘There are three hundred and seventy-two competent renderings of a single verse of one of the
more cryptic Odes, and it has been aptly claimed that even the appearance of a giraffe must be
capable of some rational explanation.’ Ernest Bramah, The Moon of Much Gladness.


15·01. We define the factorial function* by

$$z! = \int_0^\infty u^ze^{-u}\,du, \tag{1}$$

where $\Re(z) > -1$. Convergence is uniform in any region $\Re(z) > -1 + \delta$, where $\delta > 0$.

It follows immediately by integration by parts that

$$(z+1)! = (z+1)\,z!, \tag{2}$$

and hence for z a positive integer, since

$$0! = 1, \tag{3}$$

$$z! = 1.2.3\ldots z. \tag{4}$$

Also z! is an analytic function because it has a derivative

$$\frac{d}{dz}z! = \int_0^\infty u^z\log u\,e^{-u}\,du, \tag{5}$$

which converges uniformly in any closed region of z for $\Re(z) > -1$.

Now since z! is analytic we may expect that it will have an analytic continuation into
the region where the integral (1) is not convergent. In fact if $\Re(z) < -1$ we can choose
an integer n greater than $\Re(-z)$, and define

$$z! = \frac{(z+n)!}{(z+n)(z+n-1)\ldots(z+1)}. \tag{6}$$

By the relation (2) this is true for $\Re(z) > -1$. For $\Re(z) < -1$ the expression on the right is
the ratio of two analytic functions and therefore is an analytic function for all $\Re(z) > -n$,
except for poles where z is a negative integer. For its meaning to be unique it must be
independent of the choice of n, subject to $\Re(n+z) > -1$; but if m is an integer greater than n,

$$\frac{(z+m)!}{(z+m)(z+m-1)\ldots(z+1)}\bigg/\frac{(z+n)!}{(z+n)(z+n-1)\ldots(z+1)} = \frac{(z+m)!}{(z+n)!\,(z+n+1)\ldots(z+m)} = 1, \tag{7}$$

* This function is usually denoted by Γ(z + 1) in mathematical writings except when z is a positive
integer. The Γ notation seems to have arisen through some idea that n! is defined in the first place only
for n a positive integer, and that when we extend the domain of application we need a new notation.
This is contrary to usual mathematical practice. Ordinarily if a definition is applicable only in a 
restricted domain, and another is found that is applicable in a wider domain and is equivalent to the 
first in the restricted domain, the second is taken as giving a meaning to the old term in the wider 
domain, sc* is originally defined only for x a positive integer and then extended in turn to rational, 
negative, irrational and complex numbers; sin x is originally defined for real argument and extended
to complex numbers by means of power series. Weierstrass's theory of analytic continuation rests
on the same principle. There is no reason why the factorial function should receive exceptional
treatment in this respect. The extra 1 in the definition of Γ(z) is a minor but continual nuisance. As the
factorial notation has been adopted in the British Association Tables, which are the fullest tables of
the function, there seems to be no reason why the Γ notation should be perpetuated. Gauss's Π(z)
is often used and is equivalent to z!; but it is liable to confusion with the general notation for a product.

and therefore the definition (6) is unique. It therefore must be the analytic continuation
of z! and can be taken as completing the definition. It follows at once that z! has simple
poles at all negative integral values of z, and no other singularities for finite z.

We have already had (5·056)

$$(-\tfrac12)! = \sqrt\pi, \tag{8}$$

whence

$$(\tfrac12)! = \tfrac12\sqrt\pi,\qquad (\tfrac32)! = \tfrac34\sqrt\pi,\ \ldots, \tag{9}$$

$$(-\tfrac32)! = -2\sqrt\pi,\qquad (-\tfrac52)! = \tfrac43\sqrt\pi,\ \ldots. \tag{10}$$
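The continuation (6) and the half-integer values can be tried numerically. This is a minimal sketch, assuming that the standard library's `math.gamma` may stand in for the integral (1) through z! = Γ(z + 1); the names `fact` and `fact_continued` are illustrative and not from the text.

```python
import math

def fact(z):
    """z! for real z, taken as Gamma(z + 1) from the standard library."""
    return math.gamma(z + 1.0)

def fact_continued(z, n):
    """Continuation (6): z! = (z+n)! / ((z+n)(z+n-1)...(z+1)).

    Intended for use with z + n > -1, so that (z+n)! is given by the integral (1).
    """
    denom = 1.0
    for k in range(1, n + 1):
        denom *= (z + k)
    return fact(z + n) / denom

root_pi = math.sqrt(math.pi)
print(fact(-0.5) / root_pi)                 # (-1/2)! in units of sqrt(pi)
print(fact_continued(-1.5, 1) / root_pi)    # (-3/2)! from (6) with n = 1
print(fact_continued(-1.5, 3) / root_pi)    # same value with n = 3
```

The last two lines illustrate the uniqueness argument of (7): the value does not depend on the choice of n.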

15·02. The Beta function. For $\Re(m), \Re(n) > -1$ put $\Re(m) = \mu$, $\Re(n) = \nu$; then

$$m!\,n! = \int_0^\infty e^{-x}x^m\,dx\int_0^\infty e^{-y}y^n\,dy = \lim_{A\to\infty}\int_0^A\!\!\int_0^A e^{-(x+y)}x^my^n\,dx\,dy. \tag{1}$$

We take the triangles (0,0) (A,0) (0,A) and (A,0) (0,A) (A,A)
separately. For the former, put x + y = z and eliminate y; then
z < A, and the range of x is from 0 to z, since y > 0. The integral
over the triangle (0,0) (A,0) (0,A) is then

$$\int_0^A dz\int_0^z e^{-z}x^m(z-x)^n\,dx,$$

and putting x = tz we have

$$\int_0^A dz\int_0^1 z^{m+n+1}e^{-z}t^m(1-t)^n\,dt = \int_0^1 t^m(1-t)^n\,dt\int_0^A z^{m+n+1}e^{-z}\,dz \to (m+n+1)!\int_0^1 t^m(1-t)^n\,dt. \tag{2}$$

The integral over the triangle (A,0) (A,A) (0,A) of $x^my^n$ has modulus

$$\le \int_0^A\!\!\int_0^A x^\mu y^\nu\,dx\,dy = \frac{A^{\mu+\nu+2}}{(\mu+1)(\nu+1)}, \tag{3}$$

and that of $e^{-z}x^my^n$ has modulus

$$< \frac{e^{-A}A^{\mu+\nu+2}}{(\mu+1)(\nu+1)} \to 0. \tag{4}$$

Hence for $\Re(m) > -1$, $\Re(n) > -1$

$$\int_0^1 t^m(1-t)^n\,dt = \frac{m!\,n!}{(m+n+1)!}. \tag{5}$$

The integral is usually denoted by B(m+1, n+1), but it is somewhat easier to manipulate
in the present form.

In (5) put

$$t = \frac{u}{1+u};$$

then

$$\int_0^\infty\frac{u^m\,du}{(1+u)^{m+n+2}} = \frac{m!\,n!}{(m+n+1)!}. \tag{6}$$

Again, for $-1 < \Re(z) < 1$, putting m = z, n = −z, we have

$$\int_0^\infty\frac{u^z}{(1+u)^2}\,du = z!\,(-z)!. \tag{7}$$

But this integral is $\pi z\,\mathrm{cosec}\,\pi z$, by 12·125; hence

$$z!\,(-z)! = \frac{\pi z}{\sin\pi z}, \tag{8}$$

which is extended at once to all z other than positive or negative integers by continuation.
It is more convenient than the corresponding formula in the Γ notation, and
easier to remember on account of the obvious check at z = 0.
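A rough numerical check of the integral for m! n!/(m+n+1)! and of the product z!(−z)! may be useful here. The sketch below is not from the text: it assumes a simple midpoint rule (adequate for m, n ≥ 0) and uses `math.gamma` for the factorials; all function names are hypothetical.

```python
import math

def fact(z):
    """z! taken as Gamma(z + 1)."""
    return math.gamma(z + 1.0)

def beta_integral(m, n, steps=20000):
    """Midpoint-rule estimate of the integral of t^m (1-t)^n over (0, 1)."""
    h = 1.0 / steps
    return h * sum(((k + 0.5) * h) ** m * (1.0 - (k + 0.5) * h) ** n
                   for k in range(steps))

def beta_closed(m, n):
    """The closed form m! n! / (m+n+1)!."""
    return fact(m) * fact(n) / fact(m + n + 1)

def reflection_gap(z):
    """z!(-z)! - pi z / sin(pi z); should vanish for non-integral z."""
    return fact(z) * fact(-z) - math.pi * z / math.sin(math.pi * z)

print(beta_integral(2.5, 1.5), beta_closed(2.5, 1.5))
print(reflection_gap(0.3), reflection_gap(2.7))
```

The midpoint rule is chosen only for brevity; for m or n between −1 and 0 the end-point singularities would call for a different quadrature.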

Again, for $-1 < \Re(z)$,

$$\frac{(z!)^2}{(2z+1)!} = \int_0^1 t^z(1-t)^z\,dt. \tag{9}$$

Put 2t = 1 + s; the integral is

$$2^{-2z-1}\int_{-1}^{1}(1-s^2)^z\,ds = 2^{-2z}\int_0^1(1-s^2)^z\,ds = 2^{-2z-1}\int_0^1(1-u)^zu^{-1/2}\,du = 2^{-2z-1}\frac{z!\,(-\tfrac12)!}{(z+\tfrac12)!},$$

whence

$$z!\,(z+\tfrac12)! = 2^{-2z-1}\pi^{1/2}(2z+1)!,$$

that is,

$$z!\,(z-\tfrac12)! = 2^{-2z}\pi^{1/2}(2z)!. \quad (\Re(z) > -\tfrac12) \tag{10}$$

This is generalized by continuation. If z is a positive integer the proof is simple;

$$(2z)! = (2z)(2z-2)\ldots2\,.\,(2z-1)(2z-3)\ldots1 = 2^{2z}z!\,(z-\tfrac12)!/(-\tfrac12)!. \tag{11}$$
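The duplication formula (10) is easily tested for a few real values. A minimal sketch follows, again assuming `math.gamma` for the factorial; `duplication_gap` is an illustrative name only.

```python
import math

def fact(z):
    """z! taken as Gamma(z + 1)."""
    return math.gamma(z + 1.0)

def duplication_gap(z):
    """z!(z - 1/2)! - 2^(-2z) sqrt(pi) (2z)!; zero if (10) holds."""
    return fact(z) * fact(z - 0.5) - 2.0 ** (-2 * z) * math.sqrt(math.pi) * fact(2 * z)

for z in (0.0, 0.25, 3.0):
    print(z, duplication_gap(z))
```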


15·03. Gauss's definition. In (5) replace m by z and t by u/n; then

$$\frac{z!\,n!}{(n+z+1)!} = n^{-z-1}\int_0^n u^z\Big(1 - \frac{u}{n}\Big)^n\,du. \quad (\Re(z) > -1) \tag{1}$$

If $n \to \infty$ the last integral tends to

$$\int_0^\infty u^ze^{-u}\,du = z! \tag{2}$$

since $(1 - u/n)^n < e^{-u}$ for $0 < u \le n$ and the M test is satisfied.
Hence for fixed z and n tending to infinity

$$\frac{(n+z+1)!}{n!\,n^{z+1}} \to 1, \tag{3}$$

irrespective of z. Since also $(n+z+1)/n \to 1$ we can write this as

$$\frac{(n+z)!}{n!\,n^z} \to 1. \tag{4}$$

This is an equality for all positive n if z = −1 or z = 0.

Substitute (4) in 15·01 (6); then since we can take n as large an integer as we like

$$z! = \lim_{n\to\infty}\frac{n!\,n^z}{(z+1)(z+2)\ldots(z+n)} \quad (\Re(z) > -1), \tag{5}$$

which provides a definition of z! independent of the integral, since n! can be defined
directly by the product 15·01 (4). (5) was taken by Gauss as the definition of the function
and denoted by Π(z). We have not yet proved that (5) agrees with 15·01 (6) for $\Re(z) < -1$.

But

$$\Pi(z) = \lim_{n\to\infty}\frac{n^z}{(1+z)(1+\tfrac12z)\ldots(1+z/n)}, \tag{6}$$

$$\log\Pi(z) = \lim_{n\to\infty}\Big[z\Big(\log\frac21 + \log\frac32 + \ldots + \log\frac{n}{n-1}\Big) - \sum_{m=1}^{n}\log\Big(1 + \frac{z}{m}\Big)\Big] = \sum_{m=1}^{\infty}\Big\{z\log\frac{m+1}{m} - \log\Big(1 + \frac{z}{m}\Big)\Big\}. \tag{7}$$

But

$$z\log\frac{m+1}{m} - \log\Big(1 + \frac{z}{m}\Big) = z\Big(\frac1m - \frac1{2m^2} + \ldots\Big) - \Big(\frac{z}{m} - \frac{z^2}{2m^2} + \ldots\Big) \sim \frac{z(z-1)}{2m^2}. \tag{8}$$

The series for log Π(z) is therefore convergent for all z. Also $\frac{d}{dz}\log\Pi(z)$ is a uniformly
convergent series of analytic functions in any closed region excluding negative integers,
and therefore Π(z) is analytic in any region that excludes negative integers. It must
therefore be identical with z! according to our extended definition. We can therefore
remove the restriction $\Re(z) > -1$ in (5). Gauss's definition has the advantage that it is
immediately intelligible for all z except the obvious poles; but in practice the function
usually arises through the integral.

We verify at once from (5) that

$$\frac{\Pi(z)}{\Pi(z-1)} = \lim_{n\to\infty}\frac{n^z}{n^{z-1}}\,\frac{z}{z+n} = z, \tag{9}$$

giving a direct proof of the inductive relation from Gauss's definition. Also

$$z!\,(-z)! = \lim_{n\to\infty}n^z\,.\,n^{-z}\,\frac{(1.2\ldots n)^2}{(1-z^2)(2^2-z^2)\ldots(n^2-z^2)} = \frac{\pi z}{\sin\pi z}, \tag{10}$$

in agreement with 15·02 (8).
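The convergence of Gauss's product to z! is slow (the relative error falls off roughly as 1/n), which a short computation makes plain. This is a minimal sketch restricted, as an assumption for simplicity, to real z > −1 so that logarithms of the factors can be used; `gauss_product` is an illustrative name.

```python
import math

def gauss_product(z, n):
    """n! n^z / ((z+1)(z+2)...(z+n)) for real z > -1, accumulated in logarithms."""
    if z <= -1.0:
        raise ValueError("this sketch assumes real z > -1")
    log_value = z * math.log(n)
    for k in range(1, n + 1):
        log_value += math.log(k) - math.log(z + k)
    return math.exp(log_value)

for n in (10, 1000, 100000):
    print(n, gauss_product(0.5, n), math.gamma(1.5))
```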

15·04. The digamma (F) and trigamma (F') functions. These are

$$F(z) = \frac{d}{dz}\log z! = \lim_{n\to\infty}\Big(\log n - \frac{1}{z+1} - \ldots - \frac{1}{z+n}\Big), \tag{1}$$

$$F'(z) = \frac{1}{(z+1)^2} + \frac{1}{(z+2)^2} + \ldots + \frac{1}{(z+n)^2} + \ldots. \tag{2}$$

If in (1) we put z = 0 we have

$$F(0) = \lim_{n\to\infty}\Big(\log n - 1 - \frac12 - \ldots - \frac1n\Big) = -\gamma, \tag{3}$$

where γ is Euler's constant, the numerical value of which is 0·577215665.... Hence



(4) 


These functions, for 0 < z < 1·0 and z ≥ 10·0, are tabulated in the British Association
Tables. A table covering the range 0 < z < 20·0 is given by E. Pairman.* The digamma
function, also denoted by ψ(z + 1), has important applications in statistics and in the
solution of differential equations. Webster denotes it by T(z), which avoids the difficulty 
of distinguishing Ϝ from F in writing; but the latter does not seem to lead to confusion
in practice. 

Note that

$$F(z) = F(z+n) - \frac{1}{z+n} - \ldots - \frac{1}{z+1}. \tag{5}$$
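The limit (1) and the recurrence (5) can be illustrated with a finite n. The sketch below is an illustration only: `digamma_limit` simply truncates the limit at n terms, and the constant `EULER_GAMMA` is the usual double-precision value of γ rather than anything computed in the text.

```python
import math

EULER_GAMMA = 0.5772156649015329  # standard value of Euler's constant (assumed)

def digamma_limit(z, n):
    """log n - 1/(z+1) - ... - 1/(z+n): the limit (1) truncated at finite n."""
    return math.log(n) - sum(1.0 / (z + k) for k in range(1, n + 1))

n = 100000
print(digamma_limit(0.0, n), -EULER_GAMMA)            # F(0) = -gamma
z = 0.7
print(digamma_limit(z, n),
      digamma_limit(z + 2, n) - 1 / (z + 2) - 1 / (z + 1))   # relation (5) with n = 2
```

With n = 100000 the truncation error is of order 1/n, so agreement to four or five decimals is all that should be expected.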


15·05. Asymptotic formulae. The series for log z!, F(z) and F'(z) can be summed by
the Euler-Maclaurin formula (9·08). Taking (2) we have for z not real and negative


M 


dx 


n I 


+ ' 


0 (z + x ) 2 2z 2 (z+ 1) 2 (z + 2) 


;+... 


h 2 h 23A 


...-S 


* r m+1 P 2r+1 (x-m)(2r + 2)\ 


Hence F(z) = 1-^ + *! + %+... + ^+ 2 


0 J i 

J. 


m (Z + X)*+* 

m+1 (2 r + 2)^ 2r+1 ( x-m) 
(z + x) 2r +* 


dx. 


( 1 ) 


( 2 ) 


Z Z" Z“' ' * m= 0 J m 

For z real and positive all odd derivatives of $(z+x)^{-2}$ have the same sign; hence the function
then lies between the sums of r and r + 1 terms of the series. The inequality is more difficult
to state if z is complex, and obviously breaks down altogether if z is real and negative.
Similarly


C/ \ 1 ^ B, 

^) = log ,+ £. 


r 

2rz 2r 


oo /• 
m=0 J i 


and by integration 


logz! = C+(z + $)\ogZ-Z + ^ + ] 


2z 3.4 z 3 


m+1 < J>2r+l(x-m) 
(:Z + X)*+* 


dx, 


(3) 

(4) 


again in the sense that the function on the left, for real positive z, lies between the sums
of r and r + 1 terms of the series. We have to determine the constant C. Take z large and
use 15·02 (10) in the form

$$\log z! + \log(z-\tfrac12)! = -2z\log2 + \tfrac12\log\pi + \log(2z)!. \tag{5}$$

Substituting from (4) and simplifying we find that the terms that do not tend to zero for
large z give

$$C = \tfrac12\log2\pi. \tag{6}$$

Hence for large z

$$\log z! = \tfrac12\log2\pi + (z+\tfrac12)\log z - z + \frac{1}{12z} - \frac{1}{360z^3} + \frac{1}{1260z^5} - \frac{1}{1680z^7} + \ldots. \tag{7}$$

This is Stirling's formula, and is in continual use for approximations to high factorials.
It should not be regarded as an infinite series for any given z; if so regarded the terms
oscillate infinitely. For we have seen (12·07) that for large r


* 2r ~ r( 

and the general term of (7) is therefore approximately

$$(-1)^{r-1}\frac{2\,(2r-2)!}{(2\pi)^{2r}z^{2r-1}}. \tag{8}$$


Thus for any z the terms are unbounded above and below. Nevertheless the decrease of
the early terms is so rapid that the series is of the greatest use for computation. Even
for z = 1 the terms of (7) up to $z^{-5}$ and $z^{-7}$ lead to

$$1{\cdot}83730 < \log2\pi < 1{\cdot}83849.$$

The correct value is 1·83788, which is almost midway between the two approximations,
which themselves differ by only 0·00119; a truly remarkable result to be obtained on the
hypothesis that 1 is a large number. Legendre computed z! for small z by using Stirling's
series for large z and working back by successive division. The series was proved divergent
by Bayes.* The first proof of the asymptotic property is due to Cauchy (1843).
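The bounds on log 2π quoted above follow from setting z = 1 in (7), where log 1! = 0. The sketch below reproduces them; the coefficient list is taken from (7), while the function names are illustrative and `math.lgamma` is used only as an independent reference.

```python
import math

# Coefficients of 1/z, 1/z^3, 1/z^5, 1/z^7 in (7).
COEFFS = [1 / 12, -1 / 360, 1 / 1260, -1 / 1680]

def stirling_log_factorial(z, terms):
    """(7) truncated after the given number of inverse-power terms."""
    s = 0.5 * math.log(2 * math.pi) + (z + 0.5) * math.log(z) - z
    for r in range(terms):
        s += COEFFS[r] / z ** (2 * r + 1)
    return s

def log_2pi_estimate(terms):
    """Put z = 1 and log 1! = 0 in (7), then solve for log 2*pi."""
    return 2 * (1 - sum(COEFFS[:terms]))

print(log_2pi_estimate(3), math.log(2 * math.pi), log_2pi_estimate(4))
print(stirling_log_factorial(10.0, 4), math.lgamma(11.0))
```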

Stirling† actually gave the following series, with $\zeta = z + \tfrac12$ and z an integer:

$$\log z! = \tfrac12\log2\pi + \zeta\log\zeta - \zeta - \frac{1}{2.12\,\zeta} + \frac{7}{8.360\,\zeta^3} - \frac{31}{32.1260\,\zeta^5} + \frac{127}{128.1680\,\zeta^7} - \ldots. \tag{9}$$

The form (7) usually quoted as Stirling's series was found by de Moivre in consequence of
a letter from Stirling. The coefficients in (9) are somewhat smaller but not so easy to
calculate. The ingenuity needed to obtain the series with the mathematical resources
then available, not even the general definition of z! being known, is astonishing. The
usual form is most accurately described as de Moivre's form of Stirling's series, since all
the principles used in finding it were given by Stirling. His method consisted in expressing
log n as a difference f(n+1) − f(n).

If we return to 9·08 we see that the formula can be written

$$\log z! = (z+\tfrac12)\log z - z + \tfrac12\log2\pi - \int_0^\infty\frac{t - [t] - \tfrac12}{z+t}\,dt, \tag{10}$$

[t] meaning the integral part of t. The last term is the Bourguet-Stieltjes form of the
remainder‡ (actually entirely due to Stieltjes; Bourguet found the corresponding expression
in a Fourier series, which Stieltjes saw to be that of $t - [t] - \tfrac12$).

15·06. It may be verified that

$$\int_0^{\frac12\pi}\cos^m\theta\,\sin^n\theta\,d\theta = \frac12\int_0^1 t^{\frac12(m-1)}(1-t)^{\frac12(n-1)}\,dt = \frac12\,\frac{\{\tfrac12(m-1)\}!\,\{\tfrac12(n-1)\}!}{\{\tfrac12(m+n)\}!}.$$

When m and n are positive integers, we can substitute the explicit formulae for the
factorials and derive the usual elementary formulae for the integrals.

* Phil. Trans. 53, 1763, 269-271. † Methodus Differentialis, 1730, Prop. 28.

‡ J. de Math. (4), 5, 1889, 425-44.

15·07. Wallis's formula for π. We have

$$\int_0^{\frac12\pi}\sin^{2m-1}\theta\,d\theta > \int_0^{\frac12\pi}\sin^{2m}\theta\,d\theta > \int_0^{\frac12\pi}\sin^{2m+1}\theta\,d\theta,$$

and therefore for integral m

$$\frac{(2m-2)(2m-4)\ldots2}{(2m-1)(2m-3)\ldots1} > \frac{(2m-1)(2m-3)\ldots1}{2m(2m-2)\ldots2}\,\frac{\pi}{2} > \frac{2m(2m-2)\ldots2}{(2m+1)(2m-1)\ldots1},$$

that is,

$$\frac{2m(2m-2)^2\ldots2^2}{(2m-1)^2(2m-3)^2\ldots1^2} > \frac12\pi > \frac{(2m)^2(2m-2)^2\ldots2^2}{(2m+1)(2m-1)^2\ldots1^2}.$$

The ratio of the extreme members of this inequality is $\dfrac{2m}{2m+1}$, and therefore tends to 1
for large m. Hence

$$\frac12\pi = \lim_{m\to\infty}\frac{(2m)^2(2m-2)^2\ldots2^2}{(2m+1)(2m-1)^2\ldots1^2}.$$

This is Wallis's formula, of historical interest as the first expression of $\tfrac12\pi$ as the limit
of an algebraic function. It can be used to obtain the constant in Stirling's formula as
follows, and even to give the first term of the formula. Inserting the factor

$$(2m)^2(2m-2)^2\ldots2^2$$

in both numerator and denominator we get

$$\frac12\pi = \lim_{m\to\infty}\frac{2^{4m}(m!)^4}{(2m+1)\{(2m)!\}^2}. \tag{1}$$

If m is large

$$\log(2m)! - \log m! = \log(m+1) + \ldots + \log2m = \int_{m+\frac12}^{2m+\frac12}\log x\,dx + O\Big(\frac1m\Big)$$

$$= (2m+\tfrac12)\log(2m+\tfrac12) - (m+\tfrac12)\log(m+\tfrac12) - m + O\Big(\frac1m\Big). \tag{2}$$

Use this to eliminate log (2m)! from (1) and drop small terms; then

$$\log\tfrac12\pi = \lim\Big[2\log m! + 4m\log2 - (4m+1)\Big\{\log2m + \frac{1}{4m}\Big\} + (2m+1)\Big\{\log m + \frac{1}{2m}\Big\} + 2m - \log2m\Big]$$

$$= \lim 2\{\log m! - (m+\tfrac12)\log m + m - \log2\}, \tag{3}$$

whence

$$\lim_{m\to\infty}\frac{m!}{\sqrt{(2\pi)}\,m^{m+\frac12}e^{-m}} = 1. \tag{4}$$

This method of course succeeds only when m is a positive integer. In view of the amazing
ingenuity of many mathematical writers of the period, it is surprising that the first term
was not found in this way until Stirling got the complete expansion.
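The two bounds on ½π and the limit (4) can be examined for a moderate value of m. This is a sketch under the assumption that the partial product is built factor by factor as (2k)²/((2k−1)(2k+1)); the names `wallis_bounds` and `stirling_ratio` are illustrative.

```python
import math

def wallis_bounds(m):
    """Lower and upper members of the inequality enclosing pi/2 for given m."""
    lower = 1.0
    for k in range(1, m + 1):
        lower *= (2 * k) ** 2 / ((2 * k - 1) * (2 * k + 1))
    upper = lower * (2 * m + 1) / (2 * m)
    return lower, upper

def stirling_ratio(m):
    """m! / (sqrt(2 pi) m^(m+1/2) e^(-m)), computed through logarithms."""
    log_ratio = math.lgamma(m + 1) - (0.5 * math.log(2 * math.pi)
                                      + (m + 0.5) * math.log(m) - m)
    return math.exp(log_ratio)

lo, hi = wallis_bounds(1000)
print(lo, math.pi / 2, hi)
print(stirling_ratio(1000))
```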

15·08. Dirichlet integrals. Take

$$I_1 = \iint x^my^n\,dx\,dy \quad (m, n > -1) \tag{1}$$

over the range of integration x ≥ 0, y ≥ 0, x + y ≤ a. Substituting y = z − x we have

$$I_1 = \int_0^a\!\!\int_0^z x^m(z-x)^n\,dz\,dx, \tag{2}$$

since, if x exceeded z, y would be < 0. Put x = zt; then

$$I_1 = \int_0^a\!\!\int_0^1 z^{m+n+1}t^m(1-t)^n\,dz\,dt = \frac{a^{m+n+2}\,m!\,n!}{(m+n+2)!}. \tag{3}$$

If we vary a while keeping m and n fixed

$$\frac{dI_1}{da} = a^{m+n+1}\frac{m!\,n!}{(m+n+1)!}. \tag{4}$$

Now take

$$I_2 = \iiint x^my^nz^p\,dx\,dy\,dz \tag{5}$$

over the range x ≥ 0, y ≥ 0, z ≥ 0, x + y + z ≤ a. Then x and y are restricted so that

$$0 \le x + y \le a - z;$$

and integrating with regard to them we have

$$I_2 = \int_0^a\frac{(a-z)^{m+n+2}\,m!\,n!}{(m+n+2)!}\,z^p\,dz = \frac{a^{m+n+p+3}\,m!\,n!\,p!}{(m+n+p+3)!} \tag{6}$$

by (3). In general if

$$I = \iint\ldots\int x_1^{m_1}x_2^{m_2}\ldots x_r^{m_r}\,dx_1\,dx_2\ldots dx_r \tag{7}$$

where all $x_s \ge 0$ and $\sum x_s \le a$,

$$I = \frac{a^{\Sigma m+r}\,m_1!\,m_2!\ldots m_r!}{(\Sigma m+r)!}. \tag{8}$$

If we take m = n = 0 in (1), and a = 1, we have the area of a triangle whose corners are at
(0,0), (1,0), (0,1); and the result is $\tfrac12$. If we take m = n = p = 0 in (5), and a = 1, we
have the volume of a tetrahedron whose vertices are at (0,0,0), (1,0,0), (0,1,0), (0,0,1),
and the result is $\tfrac16$.

The integrals are easily generalized to evaluate

$$J = \iint\ldots\int f(\Sigma x_s)\,x_1^{m_1}x_2^{m_2}\ldots x_r^{m_r}\,dx_1\,dx_2\ldots dx_r, \tag{9}$$

under the same restrictions. For from (8)

$$\frac{dI}{da} = \frac{a^{\Sigma m+r-1}\,m_1!\,m_2!\ldots m_r!}{(\Sigma m+r-1)!} \tag{10}$$

and

$$\frac{dJ}{da} = f(a)\frac{dI}{da}. \tag{11}$$

Hence

$$J = \frac{m_1!\,m_2!\ldots m_r!}{(\Sigma m+r-1)!}\int_0^a f(u)\,u^{\Sigma m+r-1}\,du. \tag{12}$$

A proof in greater detail is given by L. J. Mordell.*

One of the most important applications of these integrals is in the theory of statistics,
where we often want integrals of the form

$$L = \iint\ldots\int f(\Sigma x_s^2)\,dx_1\,dx_2\ldots dx_r \tag{13}$$

subject to $\Sigma x_s^2 \le a^2$. Put

$$x_s = \xi_s^{1/2} \tag{14}$$

and take all $\xi_s \ge 0$. Then

$$L = \iint\ldots\int f(\Sigma\xi_s)\,2^{-r}\xi_1^{-1/2}\xi_2^{-1/2}\ldots\xi_r^{-1/2}\,d\xi_1\ldots d\xi_r = \frac{\pi^{\frac12r}}{2^r(\tfrac12r-1)!}\int_0^{a^2}f(u)\,u^{\frac12r-1}\,du. \tag{15}$$

This is the integral through a generalized quadrant; if we allow all the variables to have
negative signs we must multiply by $2^r$.

In particular take f unity; then we get the volume of a generalized r-dimensional
sphere of radius a,

$$L = \frac{\pi^{\frac12r}}{(\tfrac12r)!}\,a^r. \tag{16}$$

For r = 1, 2, 3 this has the values 2a, $\pi a^2$, $\tfrac43\pi a^3$, corresponding to the length of a line, the
area of a circle, and the volume of a sphere.

* Edin. Math. Soc., Math. Notes 34, 1944, 16-17.
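The closed forms for the Dirichlet integral over Σx ≤ a and for the volume of an r-dimensional sphere are simple to evaluate. The following sketch assumes `math.gamma` for the factorials; the function names are illustrative, not from the text.

```python
import math

def dirichlet_integral(ms, a):
    """a^(sum m + r) m_1! ... m_r! / (sum m + r)! for exponents ms = [m_1, ..., m_r]."""
    r, total = len(ms), sum(ms)
    value = a ** (total + r)
    for m in ms:
        value *= math.gamma(m + 1)
    return value / math.gamma(total + r + 1)

def ball_volume(r, a):
    """pi^(r/2) a^r / (r/2)!, the volume of an r-dimensional sphere of radius a."""
    return math.pi ** (r / 2) * a ** r / math.gamma(r / 2 + 1)

print(dirichlet_integral([0, 0], 1.0))      # area of the unit triangle
print(dirichlet_integral([0, 0, 0], 1.0))   # volume of the unit tetrahedron
print([ball_volume(r, 1.0) for r in (1, 2, 3)])
```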

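15·09. The exponential and related integrals. These are related to a case of the
incomplete factorial function, which we shall denote by

$$\mathrm{ei}(z) = \int_z^\infty\frac{e^{-t}}{t}\,dt. \tag{1}$$

The integral also converges if the upper limit is ∞ exp iα, and is independent of α, so long
as $-\tfrac12\pi < \alpha \le \tfrac12\pi$. Now in

$$\int_0^z\frac{1-e^{-t}}{t}\,dt, \tag{2}$$

the integrand is an integral function; hence by expansion and integration this integral
is equal to

$$z - \frac{z^2}{2.2!} + \frac{z^3}{3.3!} - \ldots, \tag{3}$$

and therefore

$$\int_z^Z\frac{e^{-t}}{t}\,dt = \int_z^Z\frac{dt}{t} - \int_z^Z\frac{1-e^{-t}}{t}\,dt = \Big[\log t - \Big(t - \frac{t^2}{2.2!} + \frac{t^3}{3.3!} - \ldots\Big)\Big]_z^Z. \tag{4}$$

The left side tends to a definite limit as Z → ∞; hence the right side does the same. Then

$$\int_z^\infty\frac{e^{-t}}{t}\,dt = C - \log z + z - \frac{z^2}{2.2!} + \frac{z^3}{3.3!} - \ldots, \tag{5}$$

where C is a constant. To identify it we have

$$\frac{d}{dz}z! = \int_0^\infty e^{-t}t^z\log t\,dt, \tag{6}$$

whence, if γ is Euler's constant,

$$-\gamma = F(0) = \int_0^\infty e^{-t}\log t\,dt. \tag{7}$$

Now

$$\int_z^\infty e^{-t}\log t\,dt = \big[-e^{-t}\log t\big]_z^\infty + \int_z^\infty\frac{e^{-t}}{t}\,dt = e^{-z}\log z + \mathrm{ei}(z) = C - (1-e^{-z})\log z + O(z),$$

and also

$$\int_z^\infty e^{-t}\log t\,dt = -\gamma - \int_0^z e^{-t}\log t\,dt. \tag{8}$$

Hence, making z tend to 0,

$$C = -\gamma, \tag{9}$$

and

$$\mathrm{ei}(z) = -\gamma - \log z + z - \frac{z^2}{2.2!} + \frac{z^3}{3.3!} - \frac{z^4}{4.4!} + \ldots. \tag{10}$$

If z is real and negative, and we take the principal value,

$$P\,\mathrm{ei}(-x) = P\int_{-x}^\infty\frac{e^{-t}}{t}\,dt, \tag{11}$$

which is denoted by −Ei(x) in published tables.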
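The series for ei(z) can be compared with a direct quadrature of the defining integral for real positive z. This is a minimal sketch, assuming Simpson's rule on a finite range (the neglected tail is of order e^{-40}) and the usual double-precision value of γ; the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329  # standard value of Euler's constant (assumed)

def ei_series(z, terms=60):
    """-gamma - log z + z - z^2/(2.2!) + z^3/(3.3!) - ... for real z > 0."""
    total = -EULER_GAMMA - math.log(z)
    power_over_fact = 1.0          # holds z^k / k!
    for k in range(1, terms + 1):
        power_over_fact *= z / k
        total += (-1) ** (k - 1) * power_over_fact / k
    return total

def ei_quadrature(z, span=40.0, steps=40000):
    """Simpson's rule for the integral of e^(-t)/t from z to z + span (steps even)."""
    h = span / steps
    f = lambda t: math.exp(-t) / t
    s = f(z) + f(z + span)
    for k in range(1, steps):
        s += (4 if k % 2 else 2) * f(z + k * h)
    return s * h / 3

for z in (0.5, 1.0, 2.0):
    print(z, ei_series(z), ei_quadrature(z))
```

For large z the series suffers from cancellation, which is where an asymptotic expansion becomes the more practical tool.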

An asymptotic expansion* for large z follows at once from 17·01, with n = 1:

$$\mathrm{ei}(z) = e^{-z}\Big(\frac1z - \frac{1!}{z^2} + \frac{2!}{z^3} - \ldots + (-1)^r\frac{r!}{z^{r+1}}\Big) + (-1)^{r+1}(r+1)!\int_z^\infty e^{-t}t^{-r-2}\,dt. \tag{12}$$

If z = iy, where y is real and positive,

$$\mathrm{ei}(iy) = \int_{iy}^{i\infty}e^{-t}\frac{dt}{t} = \int_y^\infty e^{-iu}\frac{du}{u} = \int_y^\infty\frac{\cos u}{u}\,du - i\int_y^\infty\frac{\sin u}{u}\,du, \tag{13}$$

and also

$$\mathrm{ei}(iy) = -\gamma - \log y - \tfrac12i\pi + i\Big(y - \frac{y^3}{3.3!} + \ldots\Big) + \Big(\frac{y^2}{2.2!} - \frac{y^4}{4.4!} + \ldots\Big). \tag{14}$$

Hence

$$\int_y^\infty\frac{\cos u}{u}\,du = -\gamma - \log y + \frac{y^2}{2.2!} - \frac{y^4}{4.4!} + \ldots, \tag{15}$$

$$\int_y^\infty\frac{\sin u}{u}\,du = \tfrac12\pi - y + \frac{y^3}{3.3!} - \frac{y^5}{5.5!} + \ldots. \tag{16}$$

The asymptotic formulae follow from

$$\mathrm{ei}(iy) \sim e^{-iy}\Big(-\frac{i}{y} + \frac{1}{y^2} + \frac{i.2!}{y^3} - \frac{3!}{y^4} - \ldots\Big), \tag{17}$$

whence

$$\int_y^\infty\frac{\cos u}{u}\,du \sim \cos y\Big(\frac{1!}{y^2} - \frac{3!}{y^4} + \ldots\Big) - \sin y\Big(\frac1y - \frac{2!}{y^3} + \frac{4!}{y^5} - \ldots\Big). \tag{18}$$

The tabulated functions Ci(y) and Si(y) are

$$\mathrm{Ci}(y) = -\int_y^\infty\frac{\cos u}{u}\,du,\qquad \mathrm{Si}(y) = \tfrac12\pi - \int_y^\infty\frac{\sin u}{u}\,du = \int_0^y\frac{\sin u}{u}\,du.$$

If in (1) we put

$$z = -\log\zeta,\qquad t = -\log u,$$

we have

$$\mathrm{ei}(-\log\zeta) = -\int_0^\zeta\frac{du}{\log u}, \tag{19}$$

which is denoted by −li(ζ), the principal value being taken if ζ is real and greater than 1.

* This section should be read after the beginning of Chapter 17.

A related integral

$$I = \int_0^\infty\frac{\sin xt}{1+t^2}\,dt$$

cannot be evaluated directly by the method used for the corresponding integral in cos xt,
since the integrand is an odd function of t. We can, however, write it as

$$\Im\int_0^\infty\frac{e^{ixt}}{1+t^2}\,dt,$$

and replace the path by one from 0 to i∞, with a semicircle about t = i. The residue at
this pole is $e^{-x}/2i$, and the semicircle therefore contributes nothing to the imaginary part
of the integral. Then for x > 0

$$I = \Im\int_0^{i\infty}\frac{e^{ixt}}{1+t^2}\,dt = P\int_0^\infty\frac{e^{-xu}}{1-u^2}\,du$$

$$= \tfrac12P\int_0^\infty e^{-xu}\Big(\frac{1}{1+u} + \frac{1}{1-u}\Big)\,du$$

$$= -\tfrac12P\int_{-1}^\infty e^{-x-xv}\frac{dv}{v} + \tfrac12\int_1^\infty e^{x-xv}\frac{dv}{v}$$

$$= -\tfrac12e^{-x}\,P\,\mathrm{ei}(-x) + \tfrac12e^{x}\,\mathrm{ei}(x)$$

$$\sim \frac1x + \frac{2!}{x^3} + \frac{4!}{x^5} + \ldots.$$


EXAMPLES 

1. Find the positive number a such that the product $1.2^2.3^3\ldots n^n$ is large or small in comparison
with $n^{a'n^2}$ according as a' is less or greater than a. (M.T. 1943.)

2. Prove that the residue of z! at z = −n, where n is a positive integer, is $(-1)^{n-1}/(n-1)!$.

3. Prove that if n is a positive integer

$$z!\,\Big(z - \frac1n\Big)!\ldots\Big(z - \frac{n-1}{n}\Big)! = (2\pi)^{\frac12(n-1)}\,n^{-nz-\frac12}\,(nz)!. \quad\text{(Euler.)}$$

(Show that the ratio of the two sides is unaltered by changing z to z + 1; and that it is of the form
1 + O(1/z) for z large.)

4. Obtain the same result from Gauss’s definition. 

5. Prove that for $\Re(z) > -1$

$$F(z) = \lim_{n\to\infty}\Big(\log n - \int_0^\infty e^{-tz}\,\frac{e^{-t} - e^{-(n+1)t}}{1 - e^{-t}}\,dt\Big). \quad\text{(Gauss.)}$$

(Cf. Frullani's integrals.)

6. If p- 1 denotes integration from 0 to x, prove from the Fourier-Mellin theorem that 

ei (x)H(x) = log(l+p).ff(a;) 


ei(x) + \ogx+y = x-^ + j--. 


and hence that 





7. Prove that

$$\int_0^\infty\frac{u^m\,du}{(Au^2+B)^{\frac12(m+n+2)}} = \frac{\{\tfrac12(m-1)\}!\,\{\tfrac12(n-1)\}!}{\{\tfrac12(m+n)\}!\;2A^{\frac12(m+1)}B^{\frac12(n+1)}}. \quad (m > -1,\ n > -1)$$

8. Prove that

ar-n e -V(T'dcr = (n>l).

2. ]ch nl *


9. Prove that for positive real n

$$\frac1n - \log\frac{n}{n-1} < 0,\qquad \frac1n - \log\frac{n+1}{n} > 0,$$

and hence show directly that the limit defining Euler's constant exists.
10 - K 

where /(0) = 0, 0</i< 1, and f'[x) is continuous, prove that 

sin/t7 r d T x f(r})di} 

UX 7 T dx J o {x — 

11. Prove that ^ (i + ?Tl)- SB » ’ 

and derive corresponding expressions for F(x) and logo:!. 


(Abel.) 


(A. Lodge.) 


12. If I /(t) dr exists, a, /?2s 0, but not necessarily integers, and p~ a f(t)H(t) is defined by 

Jo 

p-«MH(t) =j t ^J(t-r)d^H(T)j 

show that P°f(t)H(t) =f(t)H(t) 

p-*p-f}f(t)H(t) =p-fip-*f(t)H(t) = P~ a ~Pf(t) H(t) 
and that for h > 0 p~ a e~ ph g{t) = e~ vh p~ a g(t) 

where g(t) is any integrable function zero for negative t. 

13. Using 15*05(7) and 15*02(11), derive 15*05(9). 

14. Using Wallis's formula and 15·03(4), show that Stirling's formula can be extended to complex
z for $\Re(z)$ large and positive.


15. If G(x) and S(x) are Fresnel’s integrals, defined for real x by 

C(x) + iS(x) = J* e u> dt 

prove that for large positive x 

1 In 

C(x) = - / - — Pcosa; 2 + Q sin a; 2 
1 In 

S(x) = - I - — Psina; 2 — Qcosx* 

1 / 1 1.3.5 \ x 1/1 1.3 1.3.5.7 \ 

•P(*)~2\2a; 3 2 3 x 7 + “')’ ^ ~ 2 \a; 2 2 x 6 + 2*x 9 / * 


where 




Chapter 16 


SOLUTION OF LINEAR DIFFERENTIAL EQUATIONS 
OF THE SECOND ORDER 

A merry road, a mazy road, and such as we did tread 
The night we went to Birmingham by way of Beachy Head. 

G. K. Chesterton, The Flying Inn 

16*01. When the coefficients in a differential equation are variable the chief methods 
of solution are as follows: 

(1) Direct numerical solution (Chapter 9). This is often laborious, but in many cases 
it is the only method. 

(2) Solution by power series. 

(3) Solution by substitution of definite or contour integrals. 

(4) Asymptotic solutions (Chapter 17). These can be obtained by several methods. 
Direct transformation of the differential equation will often yield solutions in the form 
of asymptotic series; also a solution as a definite or contour integral can be approximated 
to by the method of steepest descents. 


16·02. Singular points of a differential equation. Any second order linear equation
can be expressed in the form

$$\frac{d^2y}{dx^2} = f\Big(x, y, \frac{dy}{dx}\Big).$$

If y and dy/dx are given at $x = x_0$, the differential equation in general determines $d^2y/dx^2$
at $x = x_0$. Differentiating the differential equation we can determine $d^3y/dx^3$ at $x = x_0$,
and so on; the terms of a Taylor series for y can thus be found in turn, and if the series has
a non-zero radius of convergence a solution exists. If y and dy/dx can take any pair of
values at $x = x_0$ without making $d^2y/dx^2$ infinite we call $x_0$ an ordinary point of the equation;
if not, we call it a singular point. For instance, if

$$\frac{d^2y}{dx^2} = -y,$$

and $y = y_0$, $dy/dx = y_1$ at $x = x_0$, there is a solution

$$y = y_0\cos(x - x_0) + y_1\sin(x - x_0)$$

whatever $x_0$, $y_0$, $y_1$ may be; hence all values of x are ordinary points of this differential
equation. But even with the first order equation


we cannot assign y arbitrarily at x = 0; for if we take y anything but 0 at x = 0, dy/dx
will be infinite and we cannot form the Taylor series. Similarly for

if y is not 0 at x = 0 either dy/dx or $d^2y/dx^2$ is infinite and we cannot form the Taylor series.
The value x = 0 is a singular point of these two equations.

An important property of linear equations is that their singular points are fixed. With
such a simple equation as

$$\frac{dy}{dx} = \frac{1}{1-y^2},$$

dy/dx is infinite where y = ±1, and this is not at the same value of x, irrespective
of the value of y at x = 0. In fact the solution is $y - \tfrac13y^3 = x + \alpha$, and y = 1 where
$x = \tfrac23 - \alpha = \tfrac23 - y(0) + \tfrac13[y(0)]^3$. It is this variability of the position of the singularities that
leads to most of the additional difficulty of the general theory of non-linear equations.

16·03. Existence of solutions about ordinary points. Consider the equation 

$$ \frac{d^2y}{dx^2} + f(x)\frac{dy}{dx} + g(x)\,y = 0, \tag{1} $$

where $f(x)$ and $g(x)$ are analytic functions of $x$ (which we may take to be a complex variable 
for extra generality) within a region including $x = 0$. When $x = 0$ let $y = y_0$, $dy/dx = y_1$. 
For them to be assignable arbitrarily $f(x)$ and $g(x)$ must be bounded near $x = 0$. Define 

$$ Q\phi(x) = \int_0^x \phi(\xi)\,d\xi. \tag{2} $$

Perform the operation $Q$ on the differential equation, assuming it possible, and integrate 
the second term by parts. Then 

$$ \frac{dy}{dx} - y_1 + y f(x) - y_0 f(0) - Q\{y f'(x)\} + Q\{g(x)\,y\} = 0, \tag{3} $$

which we can write, with $h(x) = f'(x) - g(x)$, 

$$ \frac{dy}{dx} = y_1 + y_0 f(0) - y f(x) + Q\{h(x)\,y\}. \tag{4} $$

Integrating again we have 

$$ y = y_0 + y_1 x + y_0 f(0)\,x - Q\{y f(x)\} + Q^2\{h(x)\,y\}. \tag{5} $$

Put 

$$ y_0 + y_1 x + y_0 f(0)\,x = u_1, \tag{6} $$

$$ -Q\{f(x)\,u_1\} + Q^2\{h(x)\,u_1\} = u_2, \tag{7} $$

$$ -Q\{f(x)\,u_r\} + Q^2\{h(x)\,u_r\} = u_{r+1}, \tag{8} $$

and take a straight path from 0 to $x$. Suppose that on the path $|f(x)| < S$, $|h(x)| < T$, 
$|u_1| < C$. Now if $|\phi(x)| < A|x|^r$ on the path 

$$ |Q\phi(x)| < \int_0^{|x|} A|x|^r\,|dx| = \frac{A}{r+1}|x|^{r+1}; \tag{9} $$

also there is a quantity $U$ such that $S + \tfrac12 T|x| < U$ for all points of the path. Then 

$$ |u_2| = |-Q\{f(x)\,u_1\} + Q^2\{h(x)\,u_1\}| < CS|x| + \tfrac12 CT|x|^2 < CU|x|, \tag{10} $$

$$ |u_3| < C\left(\tfrac12 SU|x|^2 + \tfrac16 TU|x|^3\right) < \frac{1}{2!}\,CU^2|x|^2, \tag{11} $$

and in general 

$$ |u_{r+1}| < \frac{1}{r!}\,CU^r|x|^r. \tag{12} $$

The series 

$$ u_1 + u_2 + u_3 + \ldots \tag{13} $$

is therefore absolutely convergent; and if the inequalities hold for $|x| < R$ and $\arg x$ 
within a given range, the series is also uniformly convergent by the M test. The separate 
terms $u_r$ are analytic functions of $x$, since $u_1$ is an analytic function and by definition 
each later term is the integral of an analytic function. But a uniformly convergent series 
of analytic functions is analytic; hence the series represents an analytic function. The 
proof that it satisfies the differential equation and the conditions imposed at $x = 0$ is 
straightforward. 

The existence of a solution is therefore proved for straight paths of integration. By 
Cauchy’s theorem it will be correct for any other path with the same termini provided 
that the new path can be deformed into a straight line without passing over any 
singularity of the functions integrated. But the only possible singularities of the functions $u_r$ 
are those of $f(x)$ and $g(x)$. The solution is therefore analytic over the whole plane except 
possibly at the singular points of the equation. 
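The successive approximations (6) to (8) can be carried out exactly when $f$ and $g$ are polynomials. The sketch below is an illustration only and is not taken from the text: polynomials are stored as coefficient lists (constant term first), powers above `order` are discarded, and all function names are hypothetical.

```python
from fractions import Fraction


def poly_add(p, q):
    """Sum of two coefficient lists."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]


def poly_mul(p, q, order):
    """Product of two coefficient lists, truncated after x**order."""
    out = [Fraction(0)] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                out[i + j] += a * b
    return out


def poly_diff(p):
    """Derivative of a coefficient list."""
    return [i * c for i, c in enumerate(p)][1:] or [Fraction(0)]


def Q(p, order):
    """The operation Q of (2): integrate from 0 to x, truncated after x**order."""
    return ([Fraction(0)] + [Fraction(c) / (i + 1) for i, c in enumerate(p)])[: order + 1]


def picard_series(f, g, y0, y1, order, terms):
    """Sum u_1 + ... + u_terms for y'' + f y' + g y = 0, y(0) = y0, y'(0) = y1."""
    h = poly_add(poly_diff(f), [-c for c in g])              # h = f' - g
    u = [Fraction(y0), Fraction(y1) + Fraction(y0) * f[0]]   # u_1, equation (6)
    total = [Fraction(0)] * (order + 1)
    for _ in range(terms):
        total = poly_add(total, u)[: order + 1]
        first = [-c for c in Q(poly_mul(f, u, order), order)]
        second = Q(Q(poly_mul(h, u, order), order), order)
        u = poly_add(first, second)                          # u_{r+1}, equation (8)
    return total


# y'' + y = 0 with y(0) = 1, y'(0) = 0 reproduces the cosine series.
print(picard_series([0], [1], 1, 0, order=6, terms=5))
```

Because each application of $Q$ raises the lowest power present, truncating at a fixed order does not disturb the coefficients that are kept.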

The method is substantially that of Picard, foreshadowed in unpublished work of 
Cauchy and in the paper of Caque already mentioned in Chapter 7.* 

The process can be extended to give continuations for the solution around the singular 
points. Suppose for instance that there is a singularity at $x = 1$, and no other. Then we 
could proceed from $x = 0$ to $x = 2 + i$, transform to $2 + i$ as a new origin, and infer a value 
of $y$ for $x = 2$, which cannot be reached by a straight path from $x = 0$. But we should not 
necessarily get the same value in this way as if we proceeded by way of $x = 2 - i$. It is 
possible, in fact, for $f(x)$ and $g(x)$ to have poles at a point and therefore to be single-valued 
near it, but $y$ and its first derivative cannot be assigned arbitrarily and the admissible 
solutions will in general have branch points. 


16·04. Power series solutions. This argument proves the existence of a solution. 
The method is seldom convenient for obtaining it explicitly, though it can sometimes be 
usefully combined with numerical integration. It does, however, show that the solution 
is expressible by a power series within a circle reaching to the nearest singularity of the 
equation. Since f(x) and g(x) are also so expressible we can substitute directly in the 
differential equation and equate coefficients. Then the coefficients in the solution can 
be found in turn. 

Thus if we take the equation satisfied by the Airy integral 

$$ \frac{d^2y}{dx^2} = xy, \tag{1} $$

we assume $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots$. 

Then by substitution 

$$ 2a_2 + 3\cdot2\,a_3 x + 4\cdot3\,a_4 x^2 + \ldots + r(r-1)a_r x^{r-2} + \ldots = a_0 x + a_1 x^2 + \ldots + a_{r-3}x^{r-2} + \ldots, $$

and 

$$ a_2 = 0, \quad a_3 = \frac{a_0}{2\cdot3}, \quad a_4 = \frac{a_1}{3\cdot4}, \quad a_r = \frac{a_{r-3}}{r(r-1)}. $$

Then 

$$ y = a_0\left(1 + \frac{x^3}{2\cdot3} + \frac{x^6}{2\cdot3\cdot5\cdot6} + \ldots\right) + a_1\left(x + \frac{x^4}{3\cdot4} + \frac{x^7}{3\cdot4\cdot6\cdot7} + \ldots\right). \tag{2} $$

This is obviously an integral function, as would be expected since the differential equation 
has no singular point. 
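The recurrence $a_r = a_{r-3}/\{r(r-1)\}$ is easily mechanized. The sketch below is illustrative only; the function name is hypothetical and exact rational arithmetic is used so that the denominators in (2) can be read off.

```python
from fractions import Fraction


def airy_series_coefficients(a0, a1, n_terms):
    """Coefficients a_0 ... a_{n_terms-1} of a power series solution of y'' = x y."""
    a = [Fraction(a0), Fraction(a1), Fraction(0)]   # a_2 = 0 from the constant terms
    for r in range(3, n_terms):
        a.append(a[r - 3] / (r * (r - 1)))          # a_r = a_{r-3} / (r (r - 1))
    return a[:n_terms]


print(airy_series_coefficients(1, 0, 7))   # series multiplying a_0
print(airy_series_coefficients(0, 1, 8))   # series multiplying a_1
```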

* For historical references see Encycl. d. math. Wiss. II, 1i, 198-200. 

Take now Legendre's equation 

$$ (1-x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + n(n+1)\,y = 0. \tag{3} $$

The singular points are at $x = \pm 1$. Try 

$$ y = a_0 + a_1 x + \ldots + a_r x^r + \ldots. $$

The constant terms and linear terms give 

$$ 2\cdot1\,a_2 + n(n+1)a_0 = 0, \qquad 3\cdot2\,a_3 + (n+2)(n-1)a_1 = 0, $$

and in general 

$$ (r+1)(r+2)\,a_{r+2} = -(n-r)(n+r+1)\,a_r. \tag{4} $$

Hence 

$$ y = a_0\left\{1 - \frac{n(n+1)}{2!}x^2 + \frac{(n-2)n(n+1)(n+3)}{4!}x^4 - \frac{(n-4)(n-2)n(n+1)(n+3)(n+5)}{6!}x^6 + \ldots\right\} $$
$$ \qquad + a_1\left\{x - \frac{(n-1)(n+2)}{3!}x^3 + \frac{(n-3)(n-1)(n+2)(n+4)}{5!}x^5 - \ldots\right\}. \tag{5} $$

We see from (4) that the radii of convergence of both series are in general 1. But if $n$ is 
an even positive integer the series of even powers terminates and reduces to a 
polynomial; if it is an odd positive integer the odd series reduces to a polynomial. In either 
case the other solution is still an infinite series and must have a singularity for $|x| = 1$, 
but there will not be even one solution analytic at both $+1$ and $-1$ unless $n$ is an integer. 
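The termination of one of the two series for integral $n$ can be seen by running the recurrence (4). This is a sketch for illustration, with a hypothetical function name; $n$ is assumed to be an integer (or a `Fraction`) so that the arithmetic stays exact.

```python
from fractions import Fraction


def legendre_series(n, a0, a1, max_power):
    """Coefficients a_0 ... a_max_power from (r+1)(r+2) a_{r+2} = -(n-r)(n+r+1) a_r."""
    a = [Fraction(a0), Fraction(a1)]
    for r in range(max_power - 1):
        a.append(-Fraction((n - r) * (n + r + 1), (r + 1) * (r + 2)) * a[r])
    return a


print(legendre_series(2, 1, 0, 6))   # even series terminates for n = 2
print(legendre_series(3, 0, 1, 5))   # odd series terminates for n = 3
```

For $n = 2$ the even series stops at $1 - 3x^2$, and for $n = 3$ the odd series stops at $x - \tfrac53 x^3$; these are constant multiples of the usual Legendre polynomials.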

In both these examples there is a two-term recurrence relation between the coefficients. 
This is fortunately true of many of the solutions of the differential equations of physics, 
and the series solutions are then easy. If the recurrence relation involves three coefficients 
the solution is difficult, and if more it is practically impossible to obtain more than a few 
terms explicitly. 

16·05. Solution near an isolated singularity. Let $f(x)$ and $g(x)$ have a singularity 
at $x = 0$ but be single-valued in $S$, a region about 0, and have no other singularity in $S$. 
Let $x = a$ be an ordinary point in this region. Then the differential equation has two 
independent analytic solutions in a region about $a$, the first terms being 1 and $x - a$ 
respectively; denote these by $Y_1(x)$ and $Y_2(x)$. They have analytic continuations 
throughout $S$ except at 0; let continuation be carried out along a closed path about 0, returning 
to $a$. Let the continuations of $Y_1$ and $Y_2$ be $Z_1$, $Z_2$. These will not in general be equal to 
$Y_1$, $Y_2$ for given $x$, because $x = 0$ may be a branch point of the solutions. But $Z_1$, $Z_2$ will 
be solutions of the differential equation and must therefore be linear functions of $Y_1$, $Y_2$; 
thus for all $x$ in $S$ 

$$ Z_1 = a_{11}Y_1 + a_{12}Y_2, \qquad Z_2 = a_{21}Y_1 + a_{22}Y_2. \tag{1} $$

The matrix $a_{ik}$ is non-singular. For if it were singular, there would be an identically zero 
linear form $\alpha Z_1 + \beta Z_2$ with $\alpha$, $\beta$ not both zero; on reversing the process of continuation 
we could then show that $\alpha Y_1 + \beta Y_2$ is identically zero, which is impossible. Hence $a_{ik}$ has 
non-zero eigenvalues $\lambda_1$, $\lambda_2$. 

First let these eigenvalues be different. Then we can find two linear combinations of 
$Y_1$ and $Y_2$, say $W_1$ and $W_2$, such that their continuations are $\lambda_1 W_1$ and $\lambda_2 W_2$. But the 
continuation of $x^s$ is $x^s \exp 2\pi i s$. Hence if $s_1$ and $s_2$ are such that 

$$ 2\pi i s_1 = \log\lambda_1, \qquad 2\pi i s_2 = \log\lambda_2, \tag{2} $$

the functions 

$$ x^{-s_1}W_1, \qquad x^{-s_2}W_2 \tag{3} $$

are single-valued in $S$. They have therefore convergent Laurent expansions in $S$ and 
therefore within a circle about 0 extending to the nearest singularity of $f(x)$ or $g(x)$. 

If the eigenvalues of $a_{ik}$ are equal, it will be possible to reduce $a_{ik}$ to a triangular matrix 
but not in general to a diagonal one; then we can find a $W_1$ such that its continuation is 
$\lambda W_1$, but for any other linear combination $W_2$ the continuation will be $\lambda W_2 + \mu W_1$. Then 
there will be a solution $W_1$ such that $x^{-s}W_1$ is single-valued in $S$. 

For the second solution, if 

$$ 2\pi i s = \log\lambda, \qquad 2\pi i t = \frac{\mu}{\lambda}, \tag{4} $$

$W_2 - W_1 t\log x$ will have continuation $\lambda(W_2 - W_1 t\log x)$, and therefore there is in general 
a second solution $W_2$ such that 

$$ x^{-s}(W_2 - W_1 t\log x) \tag{5} $$

has a convergent Laurent expansion with the same region of validity. But $W_2$ may be 
identically equal to $W_1 t\log x$. We could, for instance, have a differential equation with 
solutions $x$, $x\log x$. In any case, however, the solutions can be written as one of the pairs 

$$ x^{s_1}z_1,\ x^{s_2}z_2; \qquad x^s z_1,\ x^s(z_2 - z_1 t\log x); \qquad x^s z_1,\ x^s z_1\log x; \tag{6} $$

where $z_1$, $z_2$ are single-valued and analytic in $S$ except possibly at $x = 0$, and therefore have 
convergent Laurent expansions in $S$ valid in a circle about 0 extending to the nearest 
singularity of $f(x)$ or $g(x)$. The third case may be regarded as a particular case of the second 
when $z_2 = 0$, $t = -1$. 
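As a check on the case of equal eigenvalues, the pair $x$, $x\log x$ mentioned above can be followed once round the origin. The differential equation written in the first line is supplied here only for illustration (it is not in the text); it is the equation that has these two functions as solutions.

```latex
\begin{align*}
% Illustrative equation (an assumption of this example) with solutions x and x log x:
& x^2 y'' - x y' + y = 0, \qquad W_1 = x, \quad W_2 = x\log x. \\
% One positive circuit about 0 replaces log x by log x + 2 pi i:
& W_1 \to W_1, \qquad W_2 \to W_2 + 2\pi i\,W_1,
  \qquad\text{so that}\qquad \lambda = 1, \quad \mu = 2\pi i. \\
% Relations (4):
& 2\pi i s = \log 1 = 0, \qquad 2\pi i t = \mu/\lambda = 2\pi i, \qquad t = 1. \\
% The combination (5) then vanishes identically:
& W_2 - W_1 t\log x = x\log x - x\log x = 0.
\end{align*}
```

Here $s$ is determined only up to an integer, and the example falls under the third of the pairs (6).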

It may happen that even if the eigenvalues of $a_{ik}$ are equal, it can be reduced to diagonal 
form. In this case $\mu = 0$, and there are solutions $W_1$, $W_2$ such that $x^{-s}W_1$, $x^{-s}W_2$ are 
single-valued in the neighbourhood. 

The condition that $f(x)$, $g(x)$ are single-valued in $S$ is necessary to this argument. For 
if either of them did not return to its original value when $x$ described a circle about 0, 
the relations (1) would not be true. 

This argument shows that solutions of the types indicated exist near a pole or an 
isolated essential singularity of $f(x)$ or $g(x)$. It leaves open the possibility that $z_1$ and $z_2$ 
may themselves have singularities at $x = 0$, which may be essential singularities even 
if $f(x)$ and $g(x)$ have only poles. 

16·051. Regular singularities. The Laurent expansions of the functions $z_1$, $z_2$ in 
16·05 may each contain only a finite number of negative powers; if so, each function has 
a pole or an ordinary point at $x = 0$. If this is so the singularity of the differential equation 
is said to be regular. Then, since an alteration of $s_1$, $s_2$ or $s$ in 16·05 by an integer does not 
affect the corresponding $\lambda$, we can rewrite the solutions in one of the forms 

$$ y_1 = x^{s_1}z_1, \qquad y_2 = x^{s_2}z_2 \tag{1} $$

or 

$$ y_1 = x^s z_1, \qquad y_2 = x^{s+c}z_2 + t\,y_1\log x, \tag{2} $$

where $z_1$, $z_2$ are now required to be analytic and not zero at $x = 0$. In (2) $c$ must be an 
integer. In (1), if $s_1 = s_2$, we can subtract a multiple of $y_1$ from $y_2$ so as to cancel the first 
term. The remainder will not vanish identically because $y_2$ is not proportional to $y_1$, and 
will be a solution independent of $y_1$ starting with a power different from $s_1$. Hence there is 
no loss of generality in taking $s_1 \neq s_2$. In (2), similarly, if $c = 0$ we can subtract a multiple 
of $y_1$ to cancel the first term in $x^s z_2$; thus there is a solution with $c$ a positive integer. In 
this case the whole of $z_2$ may disappear and we are left with the special case where 
$y_2 = y_1\log x$. Thus in (2) we can take $c \neq 0$ unless $z_2 = 0$. 

If $f(x)$ or $g(x)$ has an isolated singularity, not a branch-point, at $x = 0$, a necessary and 
sufficient condition for it to be a regular singularity of $y'' + f(x)\,y' + g(x)\,y = 0$ is that $xf(x)$ and 
$x^2 g(x)$ shall be analytic at $x = 0$. We show first that the condition is necessary. Since 
$y = A_1 y_1 + A_2 y_2$ satisfies the differential equation for all constant $A_1$, $A_2$, we must have 

$$ f(x) = -\frac{y_1 y_2'' - y_2 y_1''}{y_1 y_2' - y_2 y_1'}, \qquad g(x) = \frac{y_1' y_2'' - y_2' y_1''}{y_1 y_2' - y_2 y_1'}, \tag{3} $$

and also, since $y_1$ is a solution, 

$$ g(x) = -\frac{y_1'' + f(x)\,y_1'}{y_1}. \tag{4} $$

From (1) we find by direct substitution 

$$ f(x) = -\frac{s_1 + s_2 - 1}{x} + O(1), \qquad g(x) = \frac{s_1 s_2}{x^2} + O\!\left(\frac1x\right). \tag{5} $$

From (2), if $c$ is positive or if $z_2 = 0$, 

$$ f(x) = -\frac{2s - 1}{x} + O(1), \qquad g(x) = \frac{s^2}{x^2} + O\!\left(\frac1x\right). \tag{6} $$

From (2), if $c$ is negative, 

$$ f(x) = -\frac{2s + c - 1}{x} + O(1), \qquad g(x) = \frac{s(s+c)}{x^2} + O\!\left(\frac1x\right). \tag{7} $$

In all cases $f(x)$ and $g(x)$ contain no logarithm. Hence the condition stated is necessary. 
Conversely, let $xf(x)$, $x^2 g(x)$ be analytic at $x = 0$, and have no singularity for 

$$ |x| \le |x_0| = r_0. $$

Make a cut from 0 to $-\infty\,\mathrm{sgn}\,x_0$. Then any point $x = re^{i\alpha}$ ($r$, $\alpha$ real) within the circle 
$|x| = r_0$ and not on the cut can be reached from $x_0$ by a straight path from $x_0$ to $x_1 = r_0 e^{i\alpha}$, 
and then another from $r_0 e^{i\alpha}$ to $x$, and also by a straight path from $x_0$ to $x$ directly, and the 
two approaches, by the argument of 16·03, will give the same result. Now $y$ and $dy/dx$ are 
bounded on $|x| = r_0$. (For if $\arg x - \arg x_0 > \tfrac12\pi$ we can proceed from $x_0$ to $x_0 e^{\frac12\pi i}$ and 
then to $x$, by straight paths, on which $r \ge r_0/\sqrt2$.) Define 

$$ Q\phi(r) = \int_{r_0}^{r} \phi(\rho)\,d\rho. \tag{8} $$

Since $r < r_0$ this is negative for positive $\phi$, but $Q^2\phi(r)$ is then positive. Now if $|f(x)| \le A/r$, 
$|g(x)| \le B/r^2$ for $r \le r_0$, we can apply the method of 16·03 to the integration from $r_0$ to $r$; and 
the moduli of the successive terms are $\le$ those of the solution by the same method of 

$$ \frac{d^2z}{dr^2} + \frac{A}{r}\frac{dz}{dr} - \frac{B}{r^2}\,z = 0, \tag{9} $$

given that at $x = x_1$, $z$ and $dz/dr$ have the values of $|y|$ and $-|dy/dr|$. But this equation 
has the solution 

$$ z = Cr^{t_1} + Dr^{t_2}, \tag{10} $$

where $t_1$, $t_2$ are the roots of 

$$ t(t-1) + At - B = 0. \tag{11} $$

Since $B$ is positive the roots are real and cannot be equal; hence the solution of (9) always 
has the form (10), and if $t_1$ is the smaller root $r^{-t_1}z$ is bounded* for $0 < r \le r_0$. But $|z| \ge |y|$, 
and $C$, $D$ are clearly bounded for variations of $\alpha$ (including on the cut, from whichever 
side we approach it). Hence $x^{-t_1}y$ is bounded for $|x| \le r_0$. But this implies that the functions 
$z_1$, $z_2$ of (1), (2) have not essential singularities at 0. Hence the differential equation has 
a regular singularity at $x = 0$. 

* Of course it is not analytic, being defined only for a special path. 

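If $y = Ax + Be^{1/x}$, 

$$ \frac{d^2y}{dx^2} + \frac{2x+1}{x^3(x+1)}\left(x\frac{dy}{dx} - y\right) = 0, $$

so that both $xf(x)$ and $x^2 g(x)$ are non-analytic at $x = 0$. If 

$$ y = Ae^{1/x} + Be^{-1/x}, $$

$$ \frac{d^2y}{dx^2} + \frac{2}{x}\frac{dy}{dx} - \frac{y}{x^4} = 0, $$

so that $xf(x)$ is analytic but $x^2 g(x)$ is not. 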

16·052. Singularities at infinity. If in 

$$ \frac{d^2y}{dx^2} + f(x)\frac{dy}{dx} + g(x)\,y = 0 \tag{1} $$

we put $x = 1/\xi$, we get 

$$ \xi^4\frac{d^2y}{d\xi^2} + \{2\xi^3 - \xi^2 f(x)\}\frac{dy}{d\xi} + g(x)\,y = 0. \tag{2} $$

We call infinity an ordinary point of (1) if the functions 

$$ \frac{2}{\xi} - \frac{f(x)}{\xi^2} = 2x - x^2 f(x), \qquad \frac{g(x)}{\xi^4} = x^4 g(x) \tag{3} $$

are analytic at $\xi = 0$, that is, at $x = \infty$. We call it a regular singularity if they are 
respectively $O(1/\xi) = O(x)$ and $O(1/\xi^2) = O(x^2)$. 

It is impossible for all points, including infinity, to be ordinary points of (1). For then 
$f(x)$ and $g(x)$ must be analytic over the whole plane and therefore integral functions, and 
for $|x|$ large 

$$ 2x - x^2 f(x) = O(1), \qquad x^4 g(x) = O(1). $$

These functions are bounded over the whole plane and therefore are constants, say $a$ 
and $b$. Therefore 

$$ f(x) = \frac{2}{x} - \frac{a}{x^2}, \qquad g(x) = \frac{b}{x^4}, $$

and $x = 0$ is not an ordinary point of (1) even if $a = b = 0$. 

It is possible for all points except 0 and $\infty$ to be ordinary points of (1), and for 0 and $\infty$ 
to be regular singularities. In these conditions for $|x|$ small 

$$ xf(x) = O(1), \qquad x^2 g(x) = O(1), $$

and for $|x|$ large, 

$$ 2x - x^2 f(x) = O(x), \qquad x^4 g(x) = O(x^2). $$

Hence $xf(x)$ and $x^2 g(x)$ are bounded over the whole plane, and are constants. Then (1) 
reduces to the form 

$$ \frac{d^2y}{dx^2} + \frac{a}{x}\frac{dy}{dx} + \frac{b}{x^2}\,y = 0, $$

which can be solved in terms of elementary functions. 

16·06. Solutions near a regular singularity. In 16·051 (1) (2) the functions $z_1$, $z_2$ are 
analytic in a neighbourhood of 0. Hence all the terms of the differential equation can be 
expressed by convergent power series, and we can proceed by substituting power series 
for $y$ and equating coefficients. We rewrite the differential equation in the form 

$$ Dy = x^2\frac{d^2y}{dx^2} + (p_0 + p_1 x + \ldots)\,x\frac{dy}{dx} + (q_0 + q_1 x + \ldots)\,y = 0. \tag{1} $$

Put 

$$ y = x^s(a_0 + a_1 x + \ldots) \qquad (a_0 \neq 0) \tag{2} $$

and equate powers of $x^{s+r}$. We find 

$$ \{(s+r)(s+r-1) + p_0(s+r) + q_0\}\,a_r = \text{terms in } a_0 \text{ to } a_{r-1}. \tag{3} $$

For $r = 0$, we have, since $a_0 \neq 0$, 

$$ s(s-1) + p_0 s + q_0 = 0, \tag{4} $$

which is called the indicial equation. Let its roots be $s_1$, $s_2$. For each root, provided the 
coefficient in (3) never vanishes for $r > 0$, we can determine the $a_r$ in turn, $a_0$ being arbitrary. 
The series will converge within a circle extending to the nearest singularity of the differential 
equation other than $x = 0$*, and a linear combination of the two solutions will give the 
general solution. There is an obvious exceptional case if $s_1 = s_2$. There may be another if 
they differ by an integer; if they are $\alpha$ and $\alpha - k$ ($k > 0$), a series solution is obtained as usual 
for the larger root. But for the smaller the coefficient of $a_r$ in (3) will vanish when $r = k$ 
and with an arbitrary $a_0$, $a_k$ and higher terms will in general be infinite. The series can then 
be rendered finite only by taking $a_k$ finite and all earlier coefficients zero. But then the 
series reduces to that given by the root $\alpha$, and we have still found only one solution. 
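The determination of the $a_r$ in turn can be sketched as follows. On collecting the coefficient of $x^{s+r}$ in $Dy$, the right side of (3) is $-\sum_{j<r}\{p_{r-j}(s+j) + q_{r-j}\}a_j$; this explicit form is worked out here and is not quoted from the text. The function names are hypothetical, `p` and `q` are lists of the coefficients $p_0, p_1, \ldots$ and $q_0, q_1, \ldots$, and $s$ is assumed to be an integer or a `Fraction`.

```python
from fractions import Fraction


def indicial(s, p0, q0):
    """Left side of the indicial equation (4)."""
    return s * (s - 1) + p0 * s + q0


def frobenius_coefficients(p, q, s, n_terms, a0=1):
    """a_0 ... a_{n_terms-1} of y = x**s (a_0 + a_1 x + ...) for the equation Dy = 0."""
    def coef(c, k):
        return c[k] if k < len(c) else 0

    a = [Fraction(a0)]
    for r in range(1, n_terms):
        lead = indicial(s + r, p[0], q[0])        # coefficient of a_r in (3)
        if lead == 0:
            # The exceptional case: s + r is the other root of the indicial equation.
            raise ZeroDivisionError("coefficient of a_r vanishes at r = %d" % r)
        rhs = -sum((coef(p, r - j) * (s + j) + coef(q, r - j)) * a[j] for j in range(r))
        a.append(Fraction(rhs) / lead)
    return a


# Bessel's equation of order 0: x^2 y'' + x y' + x^2 y = 0, with s = 0.
print(frobenius_coefficients([1], [0, 0, 1], 0, 5))
```

With Bessel's equation of order 1 (`q = [-1, 0, 1]`) and the smaller root $s = -1$ the routine stops at $r = 2$, which is the case $k = 2$ of the difficulty just described.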

It may happen, however, that for an arbitrary $a_0$ the right side of the equation for $a_k$ 
also vanishes. Then the equation holds for any value of $a_k$, and the solution starting with 

$$ x^{\alpha-k} \tag{5} $$

is arbitrary by a multiple of the solution that starts with $x^\alpha$. We therefore get two 
solutions in power series in this case. This corresponds to the case where $a_{ik}$ of 16·05 is 
reducible to diagonal form in spite of having equal eigenvalues. 

It remains to examine the nature of the other solution in the cases where only one 
series solution exists. Denote this solution by 

$$ w(x) = x^\alpha(1 + a_1 x + \ldots), $$

and put 

$$ y = w(x)\,z. \tag{6} $$

The terms in $z$ cancel because $w(x)$ is a solution of the equation, and 

$$ \frac{z''}{z'} = -\frac{2w'}{w} - \frac{p_0}{x} - p_1 - p_2 x - \ldots, \tag{7} $$

whence 

$$ \log z' = C - 2\log w - p_0\log x - p_1 x - \tfrac12 p_2 x^2 - \ldots, $$

$$ z' = A\,x^{-p_0}w^{-2}\phi(x), \qquad z = A\int x^{-p_0}w^{-2}\phi(x)\,dx, \tag{8} $$

where $C$ and $A$ are arbitrary, and $\phi(x)$ is analytic, and equal to 1 for $x = 0$. The first term 
in the expansion of $x^{-p_0}w^{-2}$ is $x^{-p_0-2\alpha}$. But if the other root of the indicial equation is $\alpha - k$, 

$$ 2\alpha - k = -p_0 + 1, \tag{9} $$

and $-p_0 - 2\alpha = -k - 1$. If $k$ is not a positive integer or zero we therefore get a series of 
powers on integration, starting with $x^{-k}$. If $k = 0$, 

$$ z = A\int \frac1x\,(1 + r_1 x + r_2 x^2 + \ldots)\,dx = A(\log x + r_1 x + \tfrac12 r_2 x^2 + \ldots) + \text{constant}, \tag{10} $$

and 

$$ y = \{A(\log x + r_1 x + \tfrac12 r_2 x^2 + \ldots) + B\}\,w(x), \tag{11} $$

so that the solution is a case of 16·051 (2) with $c > 0$ unless all of $r_1, r_2, \ldots$ are zero, when it 
is a case of 16·051 (2) with $z_2 = 0$. 

* For a direct proof see Whittaker and Watson, Modern Analysis, 1915, p. 193. 

If $k$ is an integer $> 0$, 

$$ z = A\int x^{-k-1}(1 + r_1 x + r_2 x^2 + \ldots)\,dx = A\left(-\frac{x^{-k}}{k} - \frac{r_1 x^{-k+1}}{k-1} - \ldots + r_k\log x + \ldots\right). \tag{12} $$

The second solution therefore contains a part $w(x)\log x$ unless $r_k = 0$. The rest is a power 
series beginning with a term in $x^{\alpha-k}$, and $\alpha - k$ is the other root of the indicial equation. 
The solution is then of the form of 16·051 (2) with $y_1 = w(x)$ and $c = -k$ with $k$ positive. 
The forms of the indicial equation given by 16·051 (5) (6) (7) lead respectively to the 
pairs of indices $s_1, s_2$; $s, s$; $s, s + c$. Hence the correspondence of the types of solution in the 
two methods of approach is as follows. 

16·051 (1). Roots of indicial equation different; if they differ by an integer, 16·06 (12) 
with $r_k = 0$. 

16·051 (2) with $c > 0$ or with $z_2 = 0$. Roots of indicial equation equal. One solution 
contains a logarithm. $z_2 = 0$ corresponds to all $r_1, r_2, \ldots$ of 16·06 (11) being zero. 

16·051 (2) with $c < 0$. Roots of indicial equation differ by an integer; 16·06 (12) with 
$r_k \neq 0$. One solution contains a logarithm. 

The roots of the indicial equation are usually real. For if $s = u + iv$, 

$$ x^s = \exp\{(u + iv)\log x\} = x^u\{\cos(v\log x) + i\sin(v\log x)\} $$

and the complicated behaviour of the last factor near $x = 0$ is usually forbidden by some 
physical consideration. 

The importance of the case where the roots differ by an integer and a logarithm does not 
arise far exceeds what might be expected from the fact that it apparently implies two 
coincidences. The reason is that it is only in the case where $s_1$, $s_2$ are both integers and 
a logarithm does not arise that both solutions can be free from a branch-point at $x = 0$. 

16·07. The second solution may be found explicitly, once we know its form, by either 
of two methods. We can assume 

$$ y = w(x)\log x + \sum_{r=0}^{\infty} b_r x^{\alpha-k+r}, \tag{13} $$

and determine the coefficients by substitution; or we can use the method of Frobenius. 
This consists in substituting in the differential equation a series 

$$ y = x^s(1 + a_1 x + a_2 x^2 + \ldots), \tag{14} $$

and equating coefficients as before, except for the lowest power of $x$. Then for any $s$, such 
that no $s + r$ for integral $r > 0$ satisfies the indicial equation, all the coefficients are 
determined in terms of $a_0$ and $s$ exactly as before. But $y$ no longer satisfies the differential 
equation; in fact 

$$ Dy = \{s(s-1) + p_0 s + q_0\}\,x^s. \tag{15} $$

Now if the roots are different and do not differ by an integer, and we make $s$ tend to either 
of them, the coefficient tends to 0 and $y$ therefore tends to a solution of the differential 
equation. We thus recover the previous solutions. 

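If the roots are both equal to $\alpha$, the coefficient is $(s - \alpha)^2$. Hence we have when $s \to \alpha$ both 

$$ Dy \to 0, \qquad \frac{\partial}{\partial s}Dy \to 0. \tag{16} $$

But $s$ does not appear explicitly in $D$; hence 

$$ \frac{\partial}{\partial s}Dy = D\frac{\partial y}{\partial s}, \tag{17} $$

and therefore both $y$ and $\partial y/\partial s$ tend to solutions of the differential equation. Thus we get 
two independent solutions. 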

If the roots are $\alpha$ and $\alpha - k$, the coefficient is $(s - \alpha)(s - \alpha + k)$; 

$$ Dy = (s - \alpha)(s - \alpha + k)\,x^s. \tag{18} $$

When $s \to \alpha$, $y$ tends to the previous solution. But $\dfrac{\partial}{\partial s}Dy$ does not tend to 0 when $s$ tends 
to either root. This can be remedied by multiplying by $s - \alpha + k$, and we get a second 
solution 

$$ \left[\frac{\partial}{\partial s}\{(s - \alpha + k)\,y\}\right]_{s=\alpha-k}. \tag{19} $$

The factor $s - \alpha + k$ cancels the vanishing factor in the denominators when $r \ge k$ and in 
general we get an infinite series. 

It appears that in this case we could get a third solution 

$$ \left[\frac{\partial}{\partial s}\{(s - \alpha)\,y\}\right]_{s=\alpha} $$

by the same method. But no denominator in $y$ tends to 0 when $s \to \alpha$, and this expression 
is simply $(y)_{s=\alpha}$ again, so that we recover the first solution. 

In spite of its apparent simplicity the method of Frobenius is seldom used, for the 
following reason. In general we require the solution of a differential equation to satisfy 
certain conditions such as being single-valued and continuous in a region, or tending to 
zero at infinity. The solution containing a logarithm usually fails to satisfy the former 
condition and therefore we often need only the solution with the higher index, which is 
a straightforward series. On the other hand even if both solutions are power series, 
they usually tend to infinity with $x$ and to get one that tends to zero we must take a 
combination of them in a special ratio. The solution given by the method of Frobenius 
is not in general this combination, and it can be discovered only by other methods, which 
lead to determinations of the coefficients on the way. We shall find instances of such 
methods in relation to Bessel and Legendre functions and the confluent hypergeometric 
function. 

In Legendre’s equation put $x = 1 + z$. Then 

$$ (2z + z^2)\frac{d^2y}{dz^2} + 2(1 + z)\frac{dy}{dz} - n(n+1)\,y = 0. $$

The indicial equation is $2s(s - 1) + 2s = 0$, 
and both roots are zero. Using the method of Frobenius, put 

$$ y_c = z^c(1 + a_1 z + \ldots + a_r z^r + \ldots). $$

The lowest power of $z$ merely repeats the indicial equation. The terms in $z^{c+r}$ give 

$$ 2(r + 1 + c)^2 a_{r+1} = -\{(r + c)(r + 1 + c) - n(n+1)\}\,a_r = -(r + c - n)(r + c + n + 1)\,a_r $$

and 

$$ y_c = z^c\left\{1 - \frac{(c-n)(c+n+1)}{(c+1)^2}\,\frac{z}{2} + \frac{(c-n)(c-n+1)(c+n+1)(c+n+2)}{(c+1)^2(c+2)^2}\left(\frac{z}{2}\right)^2 - \ldots\right\}. $$

When $c \to 0$ we get a solution 

$$ w(z) = 1 + \frac{n(n+1)}{1^2}\,\frac{z}{2} + \frac{(n-1)n(n+1)(n+2)}{1^2\cdot2^2}\left(\frac{z}{2}\right)^2 + \ldots, $$

which could have been found directly. Its radius of convergence is 2. This solution is 
denoted by $P_n(x)$. If $n$ is an integer the series terminates and is therefore a multiple of 
the terminating series found already (16·04). 
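The series in powers of $z = x - 1$ can be summed from the recurrence with $c = 0$, namely $2(r+1)^2 a_{r+1} = -(r-n)(r+n+1)\,a_r$. The sketch below is for illustration; the function name is hypothetical and $n$ is taken to be a non-negative integer, so that the series terminates and the sum is exact.

```python
from fractions import Fraction


def legendre_about_one(n, x, n_terms=50):
    """Sum of the series w(z), z = x - 1, for a non-negative integer n."""
    z = Fraction(x) - 1
    term = Fraction(1)          # a_0 z^0
    total = Fraction(0)
    for r in range(n_terms):
        total += term
        # a_{r+1} z^{r+1} = a_r z^r * [-(r-n)(r+n+1) / (2 (r+1)^2)] * z
        term *= Fraction(-(r - n) * (r + n + 1), 2 * (r + 1) ** 2) * z
    return total


print(legendre_about_one(2, Fraction(1, 2)))   # compare (3 x^2 - 1)/2 at x = 1/2
print(legendre_about_one(3, 0))                # compare (5 x^3 - 3 x)/2 at x = 0
```

The values agree with the familiar polynomials $\tfrac12(3x^2 - 1)$ and $\tfrac12(5x^3 - 3x)$, which take the value 1 at $x = 1$.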

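The second solution can be taken as $\partial y_c/\partial c$, when $c \to 0$. We have 

$$ \frac{\partial}{\partial c}z^{c+r} = z^r\frac{\partial}{\partial c}\{\exp(c\log z)\} \to z^r\log z, $$

and the second solution is 

$$ P_n(x)\log z - \left(-\frac1n + \frac{1}{n+1} - \frac21\right)\frac{(-n)(n+1)}{1^2}\,\frac{z}{2} $$
$$ \qquad + \left(-\frac1n + \frac{1}{-n+1} + \frac{1}{n+1} + \frac{1}{n+2} - \frac21 - \frac22\right)\frac{(-n)(-n+1)(n+1)(n+2)}{1^2\cdot2^2}\left(\frac{z}{2}\right)^2 - \ldots. $$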


When $n$ is a positive integer the terminating series is also analytic at $x = -1$, but there 
will be a second solution containing $\log(x + 1)$. It can be shown that the second solution 
can be taken as 

$$ Q_n(x) = \tfrac12 P_n(x)\log\frac{x+1}{x-1} - (\text{a polynomial in } x), $$

where the polynomial consists of the positive and zero powers of $x$ in the expansion of 
$\tfrac12 P_n(x)\log\dfrac{x+1}{x-1}$ in descending powers of $x$. 

The properties of the second solution indicate at once a way of excluding it from a large 
class of physical problems without actually needing to evaluate it. The most important 
applications of the equation concern a sphere, x being the sine of the latitude. In such 
problems we usually know that the solution or its derivative is finite at the poles. But 
when both indices at the poles are zero the second solution obviously does not satisfy 
either condition. Hence the only admissible solution is the first, and this must be analytic 
at both $\pm 1$. The equation has no other singularity, and therefore the admissible solution 
must be an integral function. Of the solutions found in 16·04 the one in powers of $x$ that 
terminates is therefore an admissible solution; the infinite series with radius of 
convergence 1 is not. Consequently we can assert immediately that the terminating solution 
is the one required. Even when the differential equation has other singularities we can 
often identify the physically important solution by mere inspection of the indicial 
equation and the radii of convergence. 


16·08. Three-term recurrence relations. The following example from tidal 
theory illustrates both this principle and a method of obtaining solutions when the 
recurrence relation involves three coefficients. In the free symmetrical oscillations of 
water on a rotating globe, $\zeta$, the elevation of the surface, satisfies the equation 

$$ \frac{d}{d\mu}\left(\frac{1-\mu^2}{f^2-\mu^2}\frac{d\zeta}{d\mu}\right) + \beta\zeta = 0, $$

where $\mu$ is the sine of the latitude, $\beta$ a positive constant depending on the depth, the rate 
of rotation, and the radius of the earth, and $f$ is proportional to the speed of the oscillation 
to be determined. $\zeta$ and $\dfrac{1}{\mu^2-f^2}\dfrac{d\zeta}{d\mu}$ must be finite for all latitudes. The equation has 
singularities at $\mu = \pm 1$, $\mu = \pm f$. Suppose that $\zeta$ is expanded in powers of $1 - \mu$, starting 
with $(1 - \mu)^n$. Then the term of lowest degree in $(1 - \mu)$ is $\dfrac{2n^2}{f^2-1}(1 - \mu)^{n-1}$, and the indicial 
equation is $n^2 = 0$. Consequently one solution contains terms in $\log(1 - \mu)$ and is 
inadmissible, since it would make $d\zeta/d\mu$ infinite at the north pole. Similarly one solution 
would contain terms in $\log(1 + \mu)$ and be inadmissible at the south pole. The determination 
of the periods then reduces to finding values of $f$ that give solutions finite at both poles. 

It is easily shown that if $f \neq 0$ the roots of the indicial equation at $\mu = \pm f$ are 0 and 2; 
if $f = 0$ they are 0 and 3. On direct examination, however, it is found that the second 
solution near $\mu = f$ does not involve a logarithm, and both solutions are therefore analytic 
at $\mu = f$. Hence if we obtain a solution in powers of $\mu$ it must be analytic at all the 
singularities of the differential equation and therefore be an integral function. Hence the 
ratio of the coefficients of consecutive terms tends to 0. If it tended to $\pm 1$ we should know 
at once that the solution becomes logarithmically infinite at the poles. It is not even 
strictly necessary to examine the singularities at $\mu = \pm f$. For if $|f| < 1$ and the solution 
was not analytic at $\pm f$ the radius of convergence would be $|f| < 1$, and if both solutions 
were found to have radii of convergence $> 1$ it would follow at once that they are analytic 
at $\mu = \pm f$. 

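The method of solution adopted by Laplace and many later writers is to take 

$$ \frac{1}{\mu^2 - f^2}\frac{d\zeta}{d\mu} = a_0 + a_1\mu + a_2\mu^2 + \ldots, \tag{1} $$

whence by integration 

$$ \zeta = A - f^2 a_0\mu - \tfrac12 f^2 a_1\mu^2 + \sum_{r=2}^{\infty}\frac{a_{r-2} - f^2 a_r}{r+1}\,\mu^{r+1}, \tag{2} $$

and also 

$$ \frac{1-\mu^2}{\mu^2 - f^2}\frac{d\zeta}{d\mu} = a_0 + a_1\mu + \sum_{r=2}^{\infty}(a_r - a_{r-2})\,\mu^r. \tag{3} $$

Hence 

$$ -a_1 - \sum_{r=2}^{\infty} r(a_r - a_{r-2})\,\mu^{r-1} + \beta\left\{A - f^2 a_0\mu - \tfrac12 f^2 a_1\mu^2 + \sum_{r=2}^{\infty}\frac{a_{r-2} - f^2 a_r}{r+1}\,\mu^{r+1}\right\} = 0. \tag{4} $$

Here we can equate coefficients; the terms in $\mu^{r+1}$ give for $r \ge 2$ 

$$ -(r+2)(a_{r+2} - a_r) + \frac{\beta}{r+1}(a_{r-2} - f^2 a_r) = 0, \tag{5} $$

and if 

$$ \frac{a_{r+2}}{a_r} = N_r, \tag{6} $$

$$ N_r = 1 - \frac{\beta f^2}{(r+1)(r+2)} + \frac{\beta}{(r+1)(r+2)\,N_{r-2}}. \tag{7} $$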


We have a three-term relation between coefficients. Now if $r$ is large and $N_{r-2}$ not small 
of order $r^{-2}$, $N_r$ will be nearly 1, $N_{r+2}$ still more nearly 1, and so on. Hence the radius of 
convergence of the series (1), and therefore of the series (2), will be 1, which is what we 
must avoid. The only escape is that $N_{r-2}$ must be small of order $r^{-2}$; but then the solution 
is an integral function. Incidentally the possibility of a singularity at $\mu = \pm f$ is disposed of. 
The question that remains is whether we can choose $N_r$ so that all $N_r$ determined by 
successive applications of (7) will be small of order $r^{-2}$. Now (7) can be rewritten in the form 

$$ N_{r-2} = -\frac{\dfrac{\beta}{(r+1)(r+2)}}{1 - \dfrac{\beta f^2}{(r+1)(r+2)} - N_r}, \tag{8} $$

and $N_{r-2}$ is of the order required if $N_r$ is. We can therefore express $N_{r-2}$ as a convergent 
continued fraction; the information that $\zeta$ is an integral function is equivalent to the 
statement that all $N_r$ are small enough for the continued fractions expressing the ratios 
of successive coefficients to converge. 

The solution is either an odd or an even function of $\mu$. If we take the even solution and 
pick out the terms in $\mu^2$ from (4) we have 

$$ -3(a_3 - a_1) - \tfrac12\beta f^2 a_1 = 0, \tag{9} $$

that is, 

$$ N_1 = 1 - \frac{\beta f^2}{2\cdot3}. \tag{10} $$

But also 

$$ N_1 = -\cfrac{\dfrac{\beta}{4\cdot5}}{1 - \dfrac{\beta f^2}{4\cdot5} + \cfrac{\dfrac{\beta}{6\cdot7}}{1 - \dfrac{\beta f^2}{6\cdot7} + \cfrac{\dfrac{\beta}{8\cdot9}}{1 - \ldots}}} \tag{11} $$

and equating these two expressions we have the required equation for $f^2$. The method of 
solution is by successive approximation; a trial value of $f^2$ is substituted in (11), the last 
denominator retained being such that the term in $f^2$ is small, and $N_1$ is worked out from 
(11). Then (10) gives a second approximation to $f^2$ and we repeat the process. As a check 
an extra denominator can be retained, and if the result agrees with the previous one to the 
desired accuracy it can be taken as correct. 
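The equation for $f^2$ obtained by equating (10) and (11) can also be solved numerically. The sketch below is an illustration under stated assumptions: it works with $u = \beta f^2$, evaluates the continued fraction from the inside outwards with the neglected tail set to zero, and uses bisection in place of the successive approximation described above. The value $\beta = 1$, the bracket $6 < u < 7$, and all function names are chosen only for the example and are not taken from the text.

```python
def n1_from_continued_fraction(u, beta, depth=20):
    """N_1 from (11), keeping `depth` denominators; u stands for beta * f**2."""
    n = 0.0                                  # neglected tail of the continued fraction
    for k in range(depth, 0, -1):
        r = 2 * k + 1                        # r = 3, 5, 7, ... in relation (8)
        c = (r + 1) * (r + 2)
        n = -(beta / c) / (1.0 - u / c - n)  # N_{r-2} in terms of N_r
    return n


def mismatch(u, beta, depth=20):
    """Difference between the value of N_1 from (10) and that from (11)."""
    return (1.0 - u / 6.0) - n1_from_continued_fraction(u, beta, depth)


def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; the root must be bracketed by lo and hi."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


beta = 1.0                                   # hypothetical value, for illustration only
u = bisect(lambda v: mismatch(v, beta), 6.0, 7.0)
u_check = bisect(lambda v: mismatch(v, beta, depth=21), 6.0, 7.0)
print(u, u / beta, abs(u - u_check))         # u, the corresponding f**2, and the check
```

Retaining one more denominator, as in the last line, is the check recommended above; when $\beta$ is small the root lies close to $u = 6$, where $N_1$ from (10) vanishes.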

16·09. This method of treating solutions with three-term recurrence relations between 
the coefficients is not confined to power series. An important example is Mathieu’s 
equation 

$$ \frac{d^2y}{dx^2} + (4\alpha - 16q\cos 2x)\,y = 0, \tag{1} $$

where $q$ is real and $\alpha$ is a function of $q$ chosen so that one solution has period $2\pi$. The 
equation arose first in the study of the oscillations of an elliptic membrane; it also occurs in the 
oscillations of water in an elliptic lake and has generalizations to wave motion in any 
periodic structure such as a crystal. Periodic coefficients occur also in Hill’s method of 
treating the moon’s motion and in the motion of a pendulum on a vibrating support, but 
the solutions then have not necessarily the same period as the coefficient. 

The periodic solutions of Mathieu’s equation are all either symmetrical or 
antisymmetrical about 0 and $\tfrac12\pi$; that is, they have the same symmetry properties as $\cos x$, $\cos 2x$, 
$\sin x$, $\sin 2x$, to which they reduce when $q = 0$ and $4\alpha = 1$ or 4. We assume them to be 
expressible by convergent Fourier series. Take for instance 


$$ y = \sum_{r=0}^{\infty} A_{2r}\cos 2rx. \tag{2} $$

Then 

$$ -\sum (2r)^2 A_{2r}\cos 2rx + 4\alpha\sum A_{2r}\cos 2rx - 8q\sum A_{2r}\cos(2r-2)x - 8q\sum A_{2r}\cos(2r+2)x = 0. \tag{3} $$

Equating coefficients we have for $r \ge 2$ 

$$ 2qA_{2r-2} + (r^2 - \alpha)A_{2r} + 2qA_{2r+2} = 0, \tag{4} $$

$$ N_r = A_{2r+2}/A_{2r}, \tag{5} $$

$$ 2qN_r = -(r^2 - \alpha) - \frac{2q}{N_{r-1}}. \tag{6} $$

Hence if $N_{r-1}$ is not approximately $-2q/r^2$, $N_r$ will be large of order $r^2$, and the series will 
diverge exponentially. This contradicts the hypothesis. Hence we rewrite (6) as 

$$ N_{r-1} = -\cfrac{1}{\dfrac{r^2 - \alpha}{2q} + N_r} \tag{7} $$

and proceed as for the tidal equation. 

For the constant terms we have 

$$ \alpha A_0 - 2qA_2 = 0, \tag{8} $$

and for the terms in $\cos 2x$, since $\cos(-2x) = \cos 2x$, 

$$ 4qA_0 + (1 - \alpha)A_2 + 2qA_4 = 0. \tag{9} $$

Eliminating $A_0$ we have 

$$ \left(\frac{8q^2}{\alpha} + 1 - \alpha\right)A_2 + 2qA_4 = 0, \tag{10} $$

whence 

$$ N_1 = \frac{A_4}{A_2} = -\frac{8q^2/\alpha + 1 - \alpha}{2q}, \tag{11} $$

and also, from (7), 

$$ N_1 = -\cfrac{1}{\dfrac{4 - \alpha}{2q} - \cfrac{1}{\dfrac{9 - \alpha}{2q} - \cfrac{1}{\dfrac{16 - \alpha}{2q} - \ldots}}}. \tag{12} $$



Equating these two expressions we have an equation for $\alpha$ for given $q$. Analogous methods 
can be applied to the functions expressible by odd cosines, odd sines, and even sines.* 


* Goldstein, Camb. Phil. Trans. 23, 1927, 303-36. 
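
The equating of the two expressions for $N_1$ in the Mathieu problem above can be tried numerically. The following is only a sketch under stated assumptions: the continued fraction is truncated by setting $N_r = 0$ at a finite depth, the root is located by plain bisection, and the function names, the depth, and the bracket [0.9, 1.1] are illustrative choices, not part of the text. For small q the root near a = 1 (the solution reducing to cos 2x) should lie close to 1 + 20q²/3, an estimate obtained by keeping only the leading terms of the two expressions.

```python
def n1_from_recurrence(a, q, depth=40):
    """N_1 from the continued fraction, truncated by taking N_depth = 0 (an assumption)."""
    N = 0.0
    for r in range(depth, 1, -1):
        # Rewritten recurrence: N_{r-1} = -1 / ((r^2 - a)/(2q) + N_r)
        N = -1.0 / ((r * r - a) / (2.0 * q) + N)
    return N


def n1_from_low_order(a, q):
    """N_1 = A_4/A_2 from the constant and cos 2x terms after eliminating A_0."""
    return -(8.0 * q * q / a + 1.0 - a) / (2.0 * q)


def characteristic_value(q, lo, hi, tol=1e-12):
    """Bisection on the difference of the two expressions for N_1 (hypothetical helper)."""
    def residual(a):
        return n1_from_low_order(a, q) - n1_from_recurrence(a, q)

    f_lo, f_hi = residual(lo), residual(hi)
    if f_lo * f_hi > 0:
        raise ValueError("bracket does not enclose a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = residual(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


q = 0.05
a = characteristic_value(q, 0.9, 1.1)
print(a, 1.0 + 20.0 * q * q / 3.0)  # numerical root and the leading small-q estimate
```

The bracket must be chosen by the user for each branch; other characteristic values need other brackets.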
16*091. A three-term recurrence relation also occurs in the solution of the equations
that arise in the solution of Schrödinger's equation for the hydrogen molecular ion. The
equation is separable in spheroidal coordinates (cf. 18*063) and for a Σ-state the equations

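are
$$\frac{d}{d\xi}\left[(\xi^2 - 1)\frac{dX}{d\xi}\right] + [A + 2R\xi - p^2\xi^2]\,X = 0$$
and
$$\frac{d}{d\eta}\left[(1 - \eta^2)\frac{dY}{d\eta}\right] + [-A + p^2\eta^2]\,Y = 0,$$
where A, R and p are constants.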

These equations have a considerable literature. The reader is referred to a paper by 
W. G. Baber and H. R. Hasse,* where other references are given. 

16*092. Infinite determinants. Differential equations, especially with periodic 
coefficients, can often be treated by a method introduced in the theory of the moon’s 
motion by G. W. Hill, and developed by E. W. Brown. We illustrate it by an example. 
A light rod of length l stands on a support, and carries a mass m at its upper end. The point
of support is constrained to vibrate vertically, its displacement being a cos nt; a/l is very
small but an²/g is not necessarily small, so that n² is large compared with g/l. If θ is the
inclination of the rod to the upward vertical we easily find the equation of motion, to
the first order in a,
$$l\ddot\theta = (g - an^2\cos nt)\,\theta. \tag{1}$$
This is of Mathieu's type, but we do not assume the motion to be periodic in time 2π/n.
On the contrary, we try
$$\theta = e^{i\gamma t}\sum_{m=-\infty}^{\infty} b_me^{imnt}, \qquad \ddot\theta = -\sum (\gamma + mn)^2\,b_me^{i(\gamma + mn)t}, \tag{2}$$
$$(g - an^2\cos nt)\,\theta = e^{i\gamma t}\sum\{gb_m - \tfrac12an^2(b_{m+1} + b_{m-1})\}\,e^{imnt}, \tag{3}$$

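and (1) will be satisfied if, for all integral m,
$$\{(\gamma + mn)^2l + g\}\,b_m = \tfrac12an^2(b_{m+1} + b_{m-1}). \tag{4}$$
The equation for γ can then be written
$$\begin{vmatrix}
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot \\
-\tfrac12an^2 & (\gamma - 2n)^2l + g & -\tfrac12an^2 & 0 & 0 & 0 \\
0 & -\tfrac12an^2 & (\gamma - n)^2l + g & -\tfrac12an^2 & 0 & 0 \\
0 & 0 & -\tfrac12an^2 & \gamma^2l + g & -\tfrac12an^2 & 0 \\
0 & 0 & 0 & -\tfrac12an^2 & (\gamma + n)^2l + g & -\tfrac12an^2 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot
\end{vmatrix} = 0. \tag{5}$$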


The convergence of such infinite determinants is a matter for discussion,† but often irrelevant
because, even if the determinant does not converge, it can usually be made to do so by
multiplying rows and columns by suitable non-zero factors, which will affect neither the
roots nor the method of solution. It is obvious that if γ is a root, γ + rn is another, where
r is any integer; but this also is irrelevant because it simply makes the diagonal element
with the smallest coefficient of l come at m = −r, and the solution is unaltered.
A solution can be found by the continued fraction method, taking $M_m = b_m/b_{m-1}$ for
m > 0 and $N_m = b_m/b_{m+1}$ for m < 0.

* Proc. Camb. Phil. Soc. 31, 1935, 564-81.

† Cf. Whittaker and Watson, Modern Analysis, 1915, pp. 36, 407-10; Riesz, Équations linéaires
à une infinité d'inconnues, 1913.

For n = 0 we know that γ is purely imaginary. Let us see whether it can become real
if an²/g is large; if it can, the system becomes stable. The transition will be at γ = 0, and
then, taking m = 0, 1, −1 in turn, we have from (4)
$$b_0 = \frac{an^2}{2g}(b_1 + b_{-1}), \tag{6}$$
$$(n^2l + g)\,b_1 = \tfrac12an^2(b_0 + b_2); \qquad (n^2l + g)\,b_{-1} = \tfrac12an^2(b_0 + b_{-2}). \tag{7}$$
Then it is possible to have $b_0 > b_1 > b_2 \ldots$, $b_0 > b_{-1} > b_{-2} \ldots$ if l is large compared with a; and
then, neglecting $b_2$ and $b_{-2}$, we have
$$1 = \frac{2(\tfrac12an^2)^2}{g(n^2l + g)}, \tag{8}$$
that is,
$$a^2n^4 - 2gln^2 - 2g^2 = 0. \tag{9}$$
The positive value of n² is given by
$$a^2n^2 = gl + (g^2l^2 + 2g^2a^2)^{1/2} \approx 2gl, \tag{10}$$
so that the maximum velocity of the point of support, an, is the velocity due to a fall
through a distance l. Then an²/g is large, as expected, and we have, nearly,
$$b_1 = b_{-1} = \frac{a}{2l}\,b_0, \qquad b_2 = b_{-2} = \frac{a}{8l}\,b_1, \tag{11}$$
and the coefficients decrease rapidly with increasing |m|. Hence the inverted pendulum
can be made stable by giving a sufficiently rapid small vertical vibration to the point
of support.
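
As a rough check on the two-term estimate for the transition, the recurrence for the $b_m$ at γ = 0 can be carried to greater depth by the continued fraction method mentioned above. The sketch below assumes the symmetric case $b_{-m} = b_m$, truncates the fraction at a finite depth, and uses bisection; the function names and the sample values a = 0.01, l = 1, g = 9.81 are illustrative assumptions only.

```python
import math


def transition_residual(n2, a, l, g, depth=30):
    """g - 2 c M_1 at gamma = 0, with c = a n^2 / 2 and M_m = b_m / b_{m-1}."""
    c = 0.5 * a * n2
    M = 0.0  # truncation assumption: M_{depth+1} = 0
    for m in range(depth, 0, -1):
        # From (m^2 n^2 l + g) b_m = c (b_{m+1} + b_{m-1})
        M = c / (m * m * n2 * l + g - c * M)
    # m = 0 equation with b_{-1} = b_1: g b_0 = 2 c b_1
    return g - 2.0 * c * M


def critical_n_squared(a, l, g, tol=1e-12):
    """Bisection for the n^2 at which gamma = 0 (hypothetical helper)."""
    guess = 2.0 * g * l / a ** 2          # leading estimate a^2 n^2 = 2 g l
    lo, hi = 0.5 * guess, 2.0 * guess
    f_lo = transition_residual(lo, a, l, g)
    while hi - lo > tol * guess:
        mid = 0.5 * (lo + hi)
        f_mid = transition_residual(mid, a, l, g)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


a, l, g = 0.01, 1.0, 9.81
n2 = critical_n_squared(a, l, g)
two_term = (g * l + math.sqrt(g * g * l * l + 2.0 * g * g * a * a)) / a ** 2
print(n2, two_term)  # deeper truncation versus the two-term formula
```

With a/l small the deeper truncation changes the result only slightly, which is consistent with the rapid decrease of the coefficients.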


16*10. Solution by complex integrals. This method consists in substituting for
y in the differential equation an integral of the form $\int Te^{xt}\,dt$, where T is a function of t,
a complex variable, and the limits are fixed. Integration by parts may then yield a differential
equation for T. Take for instance Bessel's equation
$$x\frac{d}{dx}\left(x\frac{dy}{dx}\right) + (x^2 - n^2)\,y = 0. \tag{1}$$
With the suggested substitution this becomes
$$\int T\left\{x\frac{d}{dx}\left(x\frac{d}{dx}\right) + x^2 - n^2\right\}e^{xt}\,dt = 0, \tag{2}$$
that is,
$$\int T(x^2t^2 + xt + x^2 - n^2)\,e^{xt}\,dt = 0. \tag{3}$$
The problem is to find a form of T such that this will hold for all values of x, at least within
a certain range. At present we take x real and positive. We integrate by parts twice, using
$$xe^{xt}\,dt = d(e^{xt}), \tag{4}$$
and get
$$\Big[e^{xt}\{xt^2T - t^2T' - tT + xT - T'\}\Big]_A^B + \int_A^B e^{xt}\{(t^2 + 1)T'' + 3tT' + (1 - n^2)T\}\,dt = 0. \tag{5}$$
This will hold if (1) the integrated part vanishes, which will be true if, for instance, the
function is single-valued and the path closed, or if it is open and the real parts of A and B
are infinite and negative, and the factor involving T behaves like a power of t when |t|
is large, and if (2)
$$(t^2 + 1)T'' + 3tT' + T = n^2T. \tag{6}$$
Put
$$T = u/\sqrt{(1 + t^2)}. \tag{7}$$
Then
$$(t^2 + 1)\frac{d^2u}{dt^2} + t\frac{du}{dt} = n^2u, \tag{8}$$
and if
$$\frac{dt}{\sqrt{(t^2 + 1)}} = d\theta, \tag{9}$$
$$\frac{d^2u}{d\theta^2} = n^2u. \tag{10}$$
Then
$$\theta = \log\{t + \sqrt{(t^2 + 1)}\}, \tag{11}$$
$$u = \{t + \sqrt{(t^2 + 1)}\}^{\pm n}, \tag{12}$$
and possible forms for y are
$$\int_A^B \frac{e^{xt}\,dt}{\sqrt{(t^2 + 1)}\,\{t + \sqrt{(t^2 + 1)}\}^n}, \qquad \int_A^B \frac{\{t + \sqrt{(t^2 + 1)}\}^n\,e^{xt}\,dt}{\sqrt{(t^2 + 1)}}, \tag{13}$$

where A and B lie at infinity in the third and second quadrants respectively. This condition
is satisfied by the path M of 12*126.

Solution by series gives, except when n is zero or an integer (taken positive), the two
solutions
$$J_n(x) = \sum_{r=0}^{\infty}\frac{(-1)^r(\tfrac12x)^{n+2r}}{r!\,(n+r)!}, \qquad J_{-n}(x) = \sum_{r=0}^{\infty}\frac{(-1)^r(\tfrac12x)^{-n+2r}}{r!\,(-n+r)!}. \tag{14}$$
These are independent even if n is half an odd integer, though the roots of the indicial
equation then differ by an integer. We compare them with the complex integral solutions
by expanding in descending powers of t. The first integral has for its first term
$$\int_M \frac{e^{xt}\,dt}{2^nt^{n+1}} = 2\pi i\,\frac{(\tfrac12x)^n}{n!},$$
and the second similarly $2\pi i\,(\tfrac12x)^{-n}/(-n)!$. Both give series in ascending powers of x. Hence
$$J_n(x) = \frac{1}{2\pi i}\int_M \frac{e^{xt}\,dt}{\sqrt{(t^2 + 1)}\,\{t + \sqrt{(t^2 + 1)}\}^n}, \tag{15}$$
$$J_{-n}(x) = \frac{1}{2\pi i}\int_M \frac{\{t + \sqrt{(t^2 + 1)}\}^n\,e^{xt}\,dt}{\sqrt{(t^2 + 1)}}. \tag{16}$$
It follows that if n > 0, x > 0, $J_n(x)H(x)$ has the operational form*
$$J_n(x)H(x) = \frac{p}{\sqrt{(p^2 + 1)}\,\{p + \sqrt{(p^2 + 1)}\}^n}\,H(x); \tag{17}$$
but $J_{-n}(x)$ has no operational form if n > 1; for the expansion of the corresponding
operator would start with $p^n$, which we cannot interpret by the fundamental rule, and if
we apply the Bromwich integral it diverges. The modification of Bromwich's path to
make ℜ(t) tend to −∞ at the ends is essential to make the integral for $J_{-n}(x)$ converge.

* Originally obtained by the method of 21*01: H. Jeffreys, Operational Methods in Mathematical
Physics, 1927.

Numerous interesting applications of similar methods have been given by B. van der 
Pol, (15) and (16) in particular being found.* But he has assumed from the start that 
the function to be found has an image derivable by the equation 

$$F(z) = \int_0^{\infty} zf(x)\,e^{-zx}\,dx,$$

which is meaningless unless convergence conditions are satisfied. It is significant if
$f(x) = J_n(x)$, but not if $f(x) = J_{-n}(x)$ with n > 1. Yet his process yields the above differential
equation for T, and he gives the two solutions, one of which cannot be evaluated by the 
Bromwich integral, which he states as a fundamental rule at the outset. The matter is 
further confused by his use of the expression ‘operational solution’. The method used is 
not operational, his p being defined as a complex variable and not as an operator. The 
fact is that the operational method and the method of substituting contour integrals, 
though they have a certain formal similarity and though there is a domain where they 
are both applicable, can be used in different regions outside that domain, and cannot be 
interchanged indiscriminately. To give a meaning to the integral for J_ n (x) it is necessary 
to modify the Bromwich path by making its ends proceed to infinity in directions between 
the imaginary axis and the negative real axis. But such a modification in problems of 
small oscillations would give the wrong result in every case where there are infinitely many 
real periods tending to zero. In fact van der Pol, in getting an expression for J_ n (x), has 
used an equipment insufficiently strong to catch his fish; but the fish has jumped on to 
dry land beside him. 

If x is complex the series solutions are single-valued if we take −π < arg x ≤ π. If n is
integral we need no restriction on arg x, but we shall see in Chapter 21 that $J_{-n}(x)$
and $J_n(x)$ are not then independent. The complex integrals, however, need to be
modified in such a way that ℜ(xt) → −∞ at both ends of the path. We can ensure continuity
by making arg t at the ends vary continuously with arg x, and the integrals will
then always agree with the series.

The integrands have branch points at t = ±i. The integral along any loop passing
around one of these and passing to infinity so that ℜ(xt) → −∞ will give a solution of the
equation. Two other solutions can be found in this way, and are the important Hankel
functions. They are of course not independent of the two solutions $J_n(x)$ and $J_{-n}(x)$, which
can for some purposes conveniently be expressed in terms of them (cf. 21*02).


* Phil. Mag. (7) 8, 1929, 861-98; 13, 1932, 637-77.

16*11. Conversion of series into integrals. A series can sometimes be converted
into an integral directly by means of the rule
$$\frac{1}{2\pi i}\int_M \frac{e^{zx}}{z^{r+1}}\,dz = \frac{x^r}{r!},$$
where for x real and positive M is a path with termini at infinity in the third and second
quadrants and cutting the positive real axis. If we apply this to the series for $J_n(x)$ we get
$$J_n(x) = \frac{1}{2\pi i}\int_M \frac{e^{zx}}{z}\sum_{r=0}^{\infty}(-1)^r\frac{(n+2r)!}{r!\,(n+r)!}\,\frac{1}{(2z)^{n+2r}}\,dz, \tag{18}$$
which must be identical with (15); a direct method of summing the series is given in Ex. 9,
p. 496. Alternatively, we may consider, in operational form,


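$$(\tfrac12x)^nJ_n(x)H(x) = \sum_{r=0}^{\infty}\frac{(-1)^r(\tfrac12x)^{2n+2r}}{r!\,(n+r)!}\,H(x) = \sum_{r=0}^{\infty}(-1)^r\frac{(2n+2r)!}{r!\,(n+r)!}\left(\frac{1}{2p}\right)^{2n+2r}H(x). \tag{19}$$
But we have had
$$z!\,(z - \tfrac12)! = 2^{-2z}\pi^{1/2}\,(2z)!, \tag{20}$$
whence
$$(\tfrac12x)^nJ_n(x)H(x) = \frac{(n - \tfrac12)!}{\pi^{1/2}}\,\frac{p}{(p^2 + 1)^{n+1/2}}\,H(x) = \frac{(n - \tfrac12)!}{\pi^{1/2}}\,\frac{1}{2\pi i}\int_M \frac{e^{tx}}{(t^2 + 1)^{n+1/2}}\,dt. \tag{21}$$
In particular
$$(\tfrac12x)^{1/2}J_{1/2}(x)H(x) = \pi^{-1/2}\frac{p}{p^2 + 1}\,H(x) = \pi^{-1/2}\sin x\,H(x).$$
By continuation,
$$J_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\sin x. \tag{22}$$

This method fails for $J_{-1/2}(x)$, since $(n - \tfrac12)!$ is then infinite and the expression (21) takes
the form ∞ × 0. But direct examination of the series for it shows that
$$J_{-1/2}(x) = \sqrt{\frac{2}{\pi x}}\,\cos x. \tag{23}$$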

16*12. The Wronskian. If $y_1, y_2, \ldots, y_n$ are functions of x, with (n−1)th derivatives in
an open interval, the determinant
$$W = \begin{vmatrix} y_1 & y_2 & \cdots & y_n \\ y_1' & y_2' & \cdots & y_n' \\ \vdots & \vdots & & \vdots \\ y_1^{(n-1)} & y_2^{(n-1)} & \cdots & y_n^{(n-1)} \end{vmatrix} \tag{1}$$
is called their Wronskian. If there are constants $A_1, \ldots, A_n$ not all zero such that $\sum_{r=1}^{n} A_ry_r = 0$
throughout the interval, then W = 0. This follows at once if we differentiate the given relation
n − 1 times and eliminate the $A_r$ between the derived relations. Conversely, if W = 0
throughout the interval and the minor of at least one of the $y_r^{(n-1)}$ nowhere vanishes, there are
constants $A_r$ such that $\sum_{r=1}^{n} A_ry_r = 0$ throughout the interval. Suppose that the minor of
$y_n^{(n-1)}$ nowhere vanishes. Then the n − 1 equations
$$\sum_{r=1}^{n-1} B_ry_r^{(s)} + y_n^{(s)} = 0 \qquad (s = 0, 1, \ldots, n-2) \tag{2}$$
yield a set of quantities $B_r$ for each x, and since W = 0, (2) holds also for s = n − 1. Further,
the $B_r$ are differentiable, since $y_r^{(n-2)}$ is. Differentiate each of (2) and subtract the next.
Then
$$\sum_{r=1}^{n-1} B_r'y_r^{(s)} = 0 \qquad (s = 0, 1, \ldots, n-2). \tag{3}$$
But since the determinant of the $y_r^{(s)}$ nowhere vanishes, it follows that all $B_r'$ are zero.
The need of the condition that the Wronskian of some set of n − 1 of the functions is
nowhere zero may be illustrated by the example

$y_1 = 0$ (−1 < x ≤ 0), $y_1 = \exp(-1/x^2)$ (0 < x < 1),

$y_2 = \exp(-1/x^2)$ (−1 < x < 0), $y_2 = 0$ (0 ≤ x < 1).

Here $y_1y_2' - y_2y_1' = 0$ throughout the interval, but there are no non-zero $A_1$, $A_2$ such that
$A_1y_1 + A_2y_2 = 0$ everywhere.

If the y r are given to be analytic functions and there is a region where the first minor 
of some element of the last row is nowhere zero, the existence of a linear relation in this 
region follows, and can then be extended by analytic continuation to all points of the 
common part of the regions of definition of the y r . 

16*13. If $y_1, y_2, \ldots, y_n$ are solutions of a differential equation
$$y^{(n)} + \sum_{r=1}^{n} f_{n-r}(x)\,y^{(n-r)} = 0 \tag{1}$$
with no linear relation with constant coefficients connecting them, the derivative of their
Wronskian is seen to be obtained from W by replacing $y_r^{(n-1)}$ by $y_r^{(n)}$. Substituting for
$y_r^{(n)}$ from the differential equation and subtracting suitable multiples of the other rows
from the last we find
$$\frac{dW}{dx} = -f_{n-1}(x)\,W, \tag{2}$$
$$W = A\exp\left\{-\int^x f_{n-1}(u)\,du\right\}, \tag{3}$$
where A is constant. Hence if there is an x where W ≠ 0, W does not vanish anywhere.
If $f_{n-1}(x) = 0$, W is constant. For a second-order equation with one known solution $y_1$
this leads to an easily soluble first-order equation for $y_2$, equivalent to the elementary
solution by putting $y = y_1z$.

16*14. Variation of parameters. Let the differential equation be
$$\frac{d^2y}{dx^2} + f(x)\frac{dy}{dx} + g(x)\,y = S, \tag{1}$$
and suppose that we have two independent solutions of the equation when S = 0. Denote
these by $y_1$ and $y_2$. Then if $y = Ay_1 + By_2$, where A and B are constants, we have the most
general solution with S = 0. The method of variation of parameters consists in making
A and B variable functions of x and choosing them so that the equation can be satisfied
for a general S. We take, then,
$$y = P(x)\,y_1 + Q(x)\,y_2; \tag{2}$$
then
$$y' = P'y_1 + Q'y_2 + Py_1' + Qy_2'. \tag{3}$$
As we have introduced two new functions P and Q we are entitled to assume one relation
between them: we take
$$P'y_1 + Q'y_2 = 0. \tag{4}$$
Then
$$y'' = Py_1'' + Qy_2'' + P'y_1' + Q'y_2', \tag{5}$$
and substituting in (1) we have
$$\{Py_1'' + f(x)Py_1' + g(x)Py_1\} + \{Qy_2'' + f(x)Qy_2' + g(x)Qy_2\} + P'y_1' + Q'y_2' = S. \tag{6}$$
The terms in brackets cancel because $y_1$ and $y_2$ satisfy the equation with S = 0; hence we
have two equations to determine P' and Q' in terms of S. Then
$$P' = \frac{Sy_2}{y_1'y_2 - y_2'y_1}, \qquad Q' = -\frac{Sy_1}{y_1'y_2 - y_2'y_1}, \tag{7}$$
which are definite if $y_1$ and $y_2$ are linearly independent. Hence
$$y = Cy_1 + Dy_2 + \int_a^x \frac{y_1(x)\,y_2(\xi) - y_2(x)\,y_1(\xi)}{y_1'(\xi)\,y_2(\xi) - y_2'(\xi)\,y_1(\xi)}\,S(\xi)\,d\xi, \tag{8}$$
where C and D are constant; a may be taken arbitrarily, but a change of its value only
adds multiples of $y_1(x)$ and $y_2(x)$ to the solution, and therefore is equivalent to altering
C and D.

It is easy to verify that (8) satisfies (1). We have by differentiation
$$y' = Cy_1' + Dy_2' + \int_a^x \frac{y_1'(x)\,y_2(\xi) - y_2'(x)\,y_1(\xi)}{y_1'(\xi)\,y_2(\xi) - y_2'(\xi)\,y_1(\xi)}\,S(\xi)\,d\xi \tag{9}$$
(differentiation of the limit yielding nothing because the integrand vanishes there);
$$y'' = Cy_1'' + Dy_2'' + \int_a^x \frac{y_1''(x)\,y_2(\xi) - y_2''(x)\,y_1(\xi)}{y_1'(\xi)\,y_2(\xi) - y_2'(\xi)\,y_1(\xi)}\,S(\xi)\,d\xi + S(x), \tag{10}$$
since the integrand in (9) is S(x) when ξ = x. Substituting in (1) now gives an identity,
by the definitions of $y_1$, $y_2$. Further, since $y_1'y_2 - y_1y_2' \neq 0$, the constants C, D can be chosen
to make y and y' take any prescribed values at x = a, and therefore (8) is the most general
solution. In comparison with the method where only one solution $y_1$ is known and we
assume $y = y_1u$, the further integrations with this method are usually much easier.

Take in particular
$$y'' + n^2y = S,$$
with $y_1 = \cos nx$, $y_2 = \sin nx$. Then
$$y_1'y_2 - y_2'y_1 = -n,$$
$$y = -\frac1n\cos nx\int_0^x S(\xi)\sin n\xi\,d\xi + \frac1n\sin nx\int_0^x S(\xi)\cos n\xi\,d\xi + A\cos nx + B\sin nx$$
$$= -\frac1n\int_0^x S(\xi)\sin n(\xi - x)\,d\xi + A\cos nx + B\sin nx,$$
which is easily shown to satisfy all the conditions.
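
The particular integral for y'' + n²y = S can be checked numerically against cases where the answer is known in closed form. The sketch below is an illustration only: it takes A = B = 0, evaluates the integral by composite Simpson's rule, and the function name and step count are assumptions. For S = 1 the result should agree with (1 − cos nx)/n², and for S(ξ) = ξ with x/n² − sin nx/n³, both of which vanish with their first derivatives at x = 0.

```python
import math


def particular_integral(S, n, x, steps=2000):
    """(1/n) * integral from 0 to x of S(xi) sin n(x - xi) d(xi), by Simpson's rule."""
    if steps % 2:
        steps += 1                      # Simpson's rule needs an even number of intervals
    h = x / steps

    def integrand(xi):
        return S(xi) * math.sin(n * (x - xi))

    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / (3.0 * n)


n, x = 2.0, 1.5
print(particular_integral(lambda xi: 1.0, n, x), (1.0 - math.cos(n * x)) / n ** 2)
print(particular_integral(lambda xi: xi, n, x), x / n ** 2 - math.sin(n * x) / n ** 3)
```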

This method, in an extended form, is the basis of the methods used for the calculation 
of planetary orbits. Without the disturbance due to other planets, the motion of any 
planet would be an ellipse, specified by six constants determined by the position and 
velocity components at one instant. To allow for perturbations these constants are taken 
as variables; that is, at any moment we can speak of the instantaneous ellipse as the orbit 
that the planet would describe if the perturbations were removed and the planet moved 
under the solar attraction alone, with the initial position and velocity that it has at that 
instant. Perturbations will prevent this from being always the same orbit, but the 
changes are slow and determinate, and from them the position of the planet at a future 
instant can be inferred. 

It is not necessary for the validity of the method that S shall be a known function of x.
If S also involves y, (8) will still be true, though not immediately informative since we do
not know what values to take for S in the integral without knowing y first. In such a
case (8) becomes an integral equation for y, since the unknown function y occurs under
the sign of integration. We consider equations of the form
$$\frac{d^2y}{dx^2} + g(x)\,y = h(x)\,y \tag{11}$$
and take $y_1$, $y_2$ to be solutions of y'' + g(x)y = 0 such that $y_1'y_2 - y_2'y_1 = 1$. Then
$$y = A_1y_1 + A_2y_2 + \int_a^x \{y_1(x)\,y_2(\xi) - y_2(x)\,y_1(\xi)\}\,h(\xi)\,y(\xi)\,d\xi = f(x) + \int_a^x K(x, \xi)\,y(\xi)\,d\xi, \tag{12}$$
with
$$K(x, \xi) = \{y_1(x)\,y_2(\xi) - y_2(x)\,y_1(\xi)\}\,h(\xi). \tag{13}$$

This can be solved by an iterative method. First suppose K(x, ξ) to be bounded in an
interval a ≤ ξ ≤ x ≤ b. Put |K| < M. Let the upper bound of |f(x)| in the interval be N.
Substitute f(ξ) for y(ξ) in the integral in (12); then if the integral is $f_1(x)$ we have
$$|f_1(x)| \le MN(x - a). \tag{14}$$
Now substitute $f_1(x)$ in the integral in turn; if the integral is then $f_2(x)$ we have
$$|f_2(x)| \le \tfrac12M^2N(x - a)^2. \tag{15}$$
Thus we find
$$y = f(x) + f_1(x) + f_2(x) + \ldots, \tag{16}$$
which converges like the exponential series N exp{M(x − a)}. Substitution then shows
that the sum actually satisfies (12). This is a practical method of solution if $y_1$ and $y_2$ are
known and h(x) is small.
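
The iteration can be illustrated on a case with a known answer. The sketch below is an assumption-laden example, not part of the text: it takes g(x) = 1 and a constant h(x) = ε, so that the equation is y'' + (1 − ε)y = 0; with y₁ = cos x, y₂ = −sin x (which satisfy y₁'y₂ − y₂'y₁ = 1), a = 0 and f(x) = cos x, the kernel is K(x, ξ) = ε sin(x − ξ) and the exact solution is cos(√(1 − ε) x). The integrals are evaluated by the trapezoidal rule on a uniform grid; the function name, grid size, and number of terms are arbitrary choices.

```python
import math


def iterate_series(f, K, a, b, steps=200, terms=12):
    """Sum f + f_1 + f_2 + ..., where f_{k+1}(x) = integral from a to x of K(x, xi) f_k(xi) d(xi)."""
    h = (b - a) / steps
    xs = [a + i * h for i in range(steps + 1)]
    term = [f(x) for x in xs]            # f_0 = f on the grid
    total = term[:]
    for _ in range(terms):
        new_term = []
        for i, x in enumerate(xs):
            if i == 0:
                new_term.append(0.0)     # the integral over an empty range vanishes
                continue
            # Trapezoidal rule on the nodes xs[0..i]
            s = 0.5 * (K(x, xs[0]) * term[0] + K(x, xs[i]) * term[i])
            s += sum(K(x, xs[j]) * term[j] for j in range(1, i))
            new_term.append(h * s)
        term = new_term
        total = [t + u for t, u in zip(total, term)]
    return xs, total


eps = 0.1
xs, ys = iterate_series(math.cos, lambda x, xi: eps * math.sin(x - xi), 0.0, 1.0)
print(ys[-1], math.cos(math.sqrt(1.0 - eps) * xs[-1]))  # iterated sum versus exact value at x = 1
```

With M of order ε the successive terms fall off rapidly, in line with the exponential bound.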

If y x , y 2 , h are analytic in a bounded region of x, including x = a, and for any x of the 
region there is a straight line in the region connecting a and x, we can take this straight 
line as the path of integration, and a similar argument holds. The solution in the region is 
then a uniformly convergent series of analytic functions and therefore analytic. 

The method still succeeds if y is required to attain given values at two fixed values of x;
for $A_1$ and $A_2$ can be redetermined at each stage so that each approximation satisfies
the terminal conditions exactly. Generally speaking it determines a convergent series 
solution, but leaves some liberty of choice of the stage when it is convenient to turn to 
arithmetic. It is, however, usually longer than direct numerical solution by finite 
differences. 

16*15. Green's function. This is a method closely related to the last, but directly applicable
to problems where a solution is required to take definite values at fixed termini.
It can be extended to two and more dimensions, but has too many ramifications to discuss
here. It has much theoretical importance because it enables a differential equation, with
suitable boundary conditions, to be converted into an integral equation. Accounts are
given by Courant-Hilbert,* Webster† and Temple & Bickley.‡ The problems of 6*091,
6*092, 6*093 and 14*05 are particular cases of it.

* Methoden der Mathematischen Physik, 1, 1924, 273-99.
† Partial Differential Equations of Mathematical Physics, pp. 109-42, 222-38.
‡ Rayleigh's principle, 1933.
EXAMPLES

1. Find the general series solution of

y'' − xy' + 2y = 0.

Show that one solution is a polynomial, and deduce the other solution in finite terms. (I.C. 1942.) 


2. If
$$y = \frac{\sin n\theta}{\cos\tfrac12\theta}, \qquad x = \sin\tfrac12\theta,$$
prove that
$$(1 - x^2)\frac{d^2y}{dx^2} - 3x\frac{dy}{dx} + (4n^2 - 1)\,y = 0,$$
and hence that
$$\frac{\sin n\theta}{\cos\tfrac12\theta} = 2n\left\{x + \frac{4(1^2 - n^2)}{3!}x^3 + \frac{4^2(1^2 - n^2)(2^2 - n^2)}{5!}x^5 + \ldots\right\}$$
and
$$\frac{\cos n\theta}{\cos\tfrac12\theta} = 1 + \frac{1^2 - 4n^2}{2!}x^2 + \frac{(1^2 - 4n^2)(3^2 - 4n^2)}{4!}x^4 + \ldots.$$
Explain why these series have radius of convergence 1 if 2n is not an integer.

3. If
$$(1 - x^3)\frac{d^2y}{dx^2} - 6x^2\frac{dy}{dx} - 6xy = 0$$
and if y = 1, y' = 0 when x = 0, find a series for y. Sum the series and verify that the sum satisfies
the differential equation. (I.C. 1936.)

4. Prove that if x = sin ½θ, y = cos nθ,
$$(1 - x^2)\,y'' - xy' + 4n^2y = 0.$$
Hence prove that
$$\cos n\theta = 1 - \frac{n^2}{2!}(2\sin\tfrac12\theta)^2 + \frac{n^2(n^2 - 1^2)}{4!}(2\sin\tfrac12\theta)^4 - \ldots,$$
$$\frac{\sin n\theta}{\cos\tfrac12\theta} = 2n\sin\tfrac12\theta - \frac{n(n^2 - 1^2)}{3!}(2\sin\tfrac12\theta)^3 + \ldots.$$

5. What is the least number of steps required in the continuation in 16*05 when the continuation
is carried out (1) by power series, (2) by the method of 16*03, the successive origins being equally
distant from the singularity?


6. If a linear differential equation of the second order has 0, 1, ∞ as regular singularities, and no
other singularities, show that it is reducible to the form
$$x(1 - x)\,y'' + (a + bx)\,y' + \frac{c + dx + ex^2}{x(1 - x)}\,y = 0.$$
Show also that if at each of 0, 1 one solution has index 0, the last term reduces to −ey.

7. Solve completely the equation 

x 2 (l+x 2 )y'-2y = 2x*. (M.T. 1936.) 

8. If y'' + Py' + Qy = 0,

where P and $Q - q_1/x$ are analytic at x = 0, where $q_1 \neq 0$, prove that one solution of the equation
always involves a logarithm.

9. If
$$y = \frac{x}{\{x + \sqrt{(x^2 - 1)}\}^n\,\sqrt{(x^2 - 1)}}$$
and x = 1/ξ, prove by putting x = cosh u that
$$\xi^2(1 - \xi^2)\frac{d^2y}{d\xi^2} + (\xi - 4\xi^3)\frac{dy}{d\xi} - (2\xi^2 + n^2)\,y = 0.$$
Hence prove that
$$y = \sum_{r=0}^{\infty}\frac{(n + 2r)!}{r!\,(n + r)!}\left(\frac{1}{2x}\right)^{n+2r} \qquad (|x| > 1).$$
10. Show by the method of Frobenius that two solutions of
$$x\frac{d^2y}{dx^2} + y = 0$$
are
$$y_1 = x - \frac{x^2}{1\cdot2} + \frac{x^3}{1\cdot2^2\cdot3} - \ldots + \frac{(-1)^{m-1}x^m}{1\cdot2^2\cdot3^2\ldots(m-1)^2\,m} + \ldots$$
and
$$y_1\log x - \left\{1 + x - \ldots + \frac{(-1)^{m-1}x^m}{1\cdot2^2\cdot3^2\ldots(m-1)^2\,m}\left(\frac21 + \frac22 + \ldots + \frac{2}{m-1} + \frac1m\right) + \ldots\right\}.$$


11. Show that two solutions of
$$\frac{d^2y}{dx^2} - \frac{dy}{dx} + \frac{y}{x} = 0$$
are x and
$$1 - x\log x - \frac{x^2}{2!} - \ldots - \frac{x^m}{(m-1)\,m!} - \ldots.$$


12. If y'' + χ(x) y = 0,

where χ(x) is an integrable function with period 2π, prove that the solution has in general the form

$y = C_1e^{\lambda x}\phi_1(x) + C_2e^{-\lambda x}\phi_2(x)$,

where $\phi_1(x)$ and $\phi_2(x)$ have period 2π; and that in the exceptional case

$y = D_1\psi_1(x) + xD_2\psi_2(x)$,

where $\psi_1$ and $\psi_2$ have period 2π.


13. If 


dx 



+g( x )y = o 


where f(x), g(x) are analytic in a neighbourhood of $x_0$, and $f(x_0) \neq 0$, prove that the roots of the indicial
equation at $x_0$ always differ by an integer but that the solution never contains a logarithm.

14. If the Wronskian of two solutions y x , y 2 of a second-order linear equation is W, prove by direct 
transformation that the solution 

C x W dx 

is a linear combination of y x and y 2 . 

If W = 1 everywhere and J y x | is large compared with x^ +6 when x is large, e being positive, show 
that there is a second solution y 2 that is not large compared with x^ i+e , and that taking A x = 1, c = oo 
in the above expression leads to y 3 = — y 2 for any A t . 


Chapter 17

ASYMPTOTIC EXPANSIONS


Up the airy mountain
Down the rushy glen.


WILLIAM ALLINGHAM


17*01. Nature of asymptotic expansions: incomplete factorial function.
The incomplete factorial function is defined by the integral
$$I = \int_x^{\infty} e^{-t}t^{-n}\,dt, \tag{1}$$
where x and n are positive. Integrate by parts; we have
$$I = e^{-x}x^{-n} - n\int_x^{\infty} e^{-t}t^{-n-1}\,dt$$
$$= e^{-x}\{x^{-n} - nx^{-n-1} + n(n+1)x^{-n-2} - \ldots + (-1)^r n(n+1)\ldots(n+r-1)\,x^{-n-r}\}$$
$$\qquad - (-1)^r n(n+1)\ldots(n+r)\int_x^{\infty} e^{-t}t^{-n-r-1}\,dt. \tag{2}$$

This is exact. Now the integral in the last term is always positive, but its coefficient
alternates in sign with successive values of r. Consequently the error committed in neglecting
it alternates in sign, and therefore I always lies between the sums of r and r + 1
terms of the series. But the ratio of the term in $x^{-n-r}$ to the preceding one is −(n + r − 1)/x.
If, then, x is large compared with n, the terms will decrease to a minimum and then increase
again. If we stop at the smallest term but one we shall know that the error is less
than the next term, which will be a small fraction of the sum. Thus we can get a good
approximation to the value of the integral. Nevertheless the terms for a general r are
the terms of an infinitely oscillating series. The properties are similar to those we have
already found for several approximations based on the Euler-Maclaurin expansion.

Such a series is called an asymptotic expansion. It is really not correctly regarded as 
an infinite series at all, and some confusion has arisen from the expression 'use of divergent
series' in relation to such expansions. It is to be regarded as the sum of a finite number of
terms, stopping either at the smallest but one or at some earlier one when we have already 
achieved as much accuracy as we want. In suitable circumstances the accuracy may be 
very high. But unlike a convergent series, which will theoretically always give as much 
accuracy as we want if we take enough terms, an asymptotic series is definitely limited 
in accuracy; if we take more than a certain number of terms we increase the error again. 
The terms of convergent series often decrease from the start, as for the series 


$$1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \ldots, \qquad 1 - \frac{1}{2!} + \frac{1}{3!} - \ldots.$$


For the second of these the error committed in stopping at any term is less in magnitude 
than the first term neglected; for the first it is less than the last term retained. But in 
the series 

$$1 + 100 + \frac{10^4}{2!} + \frac{10^6}{3!} + \ldots,$$
though it is convergent, and accordingly defines a particular number quite precisely,
the terms increase up to the hundredth, and enormous labour would be needed to sum it
by direct computation.* For calculation an important property of a series is rapid
decrease of the early terms, and successive sums can be regarded as successive approximations.
This property may not be possessed by a convergent series, and may be possessed
by a divergent one. But successive approximation is a necessary feature of scientific
work, and is used at the stage of most calculations when the results are reduced to quantitative
answers. We seldom aim at exact answers; what is desirable is to have some idea
of the accuracy of the answers we do get, and this is given in a most convenient form by
such a series as (2).
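
The behaviour of the series for the incomplete factorial function can be seen in a small computation. The following sketch rests on stated assumptions: the sample values x = 10, n = 1 are arbitrary, the reference value of the integral comes from composite Simpson's rule on a truncated range (the neglected tail is of relative order e^{-60}), and the function names are illustrative. It lists the partial sums and lets one confirm that the integral lies between successive sums and that stopping just before the smallest term leaves an error below that term.

```python
import math


def partial_sums(x, n, count):
    """Partial sums and terms of e^{-x}{x^{-n} - n x^{-n-1} + n(n+1) x^{-n-2} - ...}."""
    term = math.exp(-x) * x ** (-n)
    sums, terms, total = [], [], 0.0
    for r in range(count):
        total += term
        sums.append(total)
        terms.append(term)
        term *= -(n + r) / x            # ratio of each term to the preceding one
    return sums, terms


def incomplete_factorial(x, n, span=60.0, steps=60000):
    """Reference value of the integral of e^{-t} t^{-n} from x, by Simpson's rule on [x, x + span]."""
    h = span / steps

    def f(t):
        return math.exp(-t) * t ** (-n)

    total = f(x) + f(x + span)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(x + i * h)
    return total * h / 3.0


sums, terms = partial_sums(10.0, 1.0, 25)
exact = incomplete_factorial(10.0, 1.0)
smallest = min(range(len(terms)), key=lambda r: abs(terms[r]))
print(exact, sums[smallest - 1], abs(terms[smallest]))
```

Taking more than about ten terms here makes the partial sums swing more widely about the integral, although each pair of successive sums still encloses it.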

The Euler-Maclaurin formula is in general asymptotic. If the function operated on is 
a polynomial the series terminates and there is no more to be said. But if the function 
contains a fractional or negative power of the argument the higher derivatives acquire 
a pair of factors of the form (n − 2r)(n − 2r − 1)/x² at each step; while the coefficients
$b_{2r}$ decrease only like $(2\pi)^{-2r}$. Hence, however small the interval used may be, the terms
will ultimately increase indefinitely on account of the accumulation of factorials in the 
numerator. Consequently the series, if regarded as an infinite series, is usually divergent. 
Yet the high apparent accuracy obtained by using it can be justified by the method of 
9*08. 

* Actually, of course, we should work out 100 log₁₀ e and then evaluate by means of a table of
logarithms to the base 10. When a multiplying machine is available two uses remain for logarithms to
base 10; to work out high powers and logarithms to base e of large numbers.

17*02. Poincaré's definition. The usual definition of an asymptotic approximation,
due to Poincaré, is that if f(z) is an analytic function, $S_n(z)$ is the sum of the terms up to
$A_nz^{-n}$ of the series
$$S(z) = A_0 + \frac{A_1}{z} + \ldots + \frac{A_n}{z^n} + \ldots, \tag{1}$$
and if $R_n(z) = f(z) - S_n(z)$, the series is called an asymptotic expansion of f(z) within a
given interval of arg z if for every n
$$\lim_{|z| \to \infty} z^nR_n(z) = 0. \tag{2}$$
We write
$$f(z) \sim S(z). \tag{3}$$

A power series in 1/z that converges for |z| > R satisfies this definition of an asymptotic
expansion. For there is an M such that the remainder after the term in $z^{-n}$ has a modulus
less than $M/|z|^{n+1}$, for all values of arg z.

17*021. Asymptotic series can be multiplied unconditionally. For if
$$S_n(z) = A_0 + \frac{A_1}{z} + \ldots + \frac{A_n}{z^n}, \tag{1}$$
$$T_n(z) = B_0 + \frac{B_1}{z} + \ldots + \frac{B_n}{z^n}, \tag{2}$$
are asymptotic representations of f(z), g(z), we can choose z so that
$$|z^n\{f(z) - S_n(z)\}|, \qquad |z^n\{g(z) - T_n(z)\}|$$
are arbitrarily small. Also
$$S_n(z)\,T_n(z) = C_0 + \frac{C_1}{z} + \ldots + \frac{C_n}{z^n} + o(z^{-n}) = U_n(z) + o(z^{-n}), \tag{3}$$
where
$$C_m = A_0B_m + A_1B_{m-1} + \ldots + A_mB_0, \tag{4}$$
and $z^n\{f(z)\,g(z) - U_n(z)\}$ is the sum of three terms that tend to zero with 1/z for all n.


17*022. An asymptotic expansion can also be integrated unconditionally. If for
α ≤ arg z ≤ β, |z| > R, f(z) is analytic, and
$$|z^n\{f(z) - S_n(z)\}| < \omega,$$
and $z_1$ satisfies the inequalities, take a path from $z_1$ to infinity with constant arg z. Then
$$\left|\int_{z_1}^{\infty}\{f(z) - S_n(z)\}\,dz\right| < \omega\int_{|z_1|}^{\infty}\frac{d|z|}{|z|^n} = \frac{\omega}{(n-1)\,|z_1|^{n-1}},$$
that is,
$$\left|z_1^{n-1}\left\{\int_{z_1}^{\infty} f(z)\,dz - \int_{z_1}^{\infty} S_n(z)\,dz\right\}\right| < \frac{\omega}{n-1},$$
so that the term by term integration of $S_n(z)$ gives an asymptotic expansion of $\int_z^{\infty} f(z)\,dz$.

If $z_1$ and $z_2$ have the same modulus and we take a circular arc about the origin to connect
them, this arc is of length $L < 2\pi|z_1|$, and
$$\left|\int_{z_1}^{z_2}\{f(z) - S_n(z)\}\,dz\right| < \frac{\omega L}{|z_1|^n} < \frac{2\pi\omega}{|z_1|^{n-1}},$$
and the same result holds. Since f(z) and $S_n(z)$ have no singularity in the region we can
connect any two points in the region by a path partly of constant arg z and partly of
constant |z| without altering the integral, and the result follows.


17*023. Asymptotic expansions are unique. For if we have for all |z| > R, α < arg z < β
$$f(z) \sim A_0 + \frac{A_1}{z} + \ldots, \qquad f(z) \sim B_0 + \frac{B_1}{z} + \ldots, \tag{1}$$
we should have
$$\lim_{z \to \infty} z^n\left(A_0 - B_0 + \frac{A_1 - B_1}{z} + \ldots + \frac{A_n - B_n}{z^n}\right) = 0 \tag{2}$$
and therefore
$$A_0 = B_0, \quad A_1 = B_1, \quad \ldots, \quad A_n = B_n. \tag{3}$$

It follows that f(z) can have an asymptotic expansion of the form 17*02 (1) for all values
of arg z only if the series converges. For if 17*02 (2) held for all values of arg z we could
choose quantities M, R such that $|z^nR_n(z)| < M$ for all |z| > R; and then $z^nR_n(z)$ would
have a convergent expansion in powers of 1/z by Cauchy's inequality. Hence f(z) would
have a convergent expansion with the asymptotic property. But then it follows that the
only asymptotic expansion of f(z) is the convergent series.


17*024. The converse is not true; the same expansion in a given region may be an
asymptotic expansion of several functions provided that their differences f(z) − g(z)
satisfy for every n
$$\lim_{|z| \to \infty} z^n\{f(z) - g(z)\} = 0.$$
This could happen, for instance, if the range of argument was −½π to ½π and
$f(z) - g(z) = e^{-z}$. Poincaré's definition therefore does not fix limits to the error for a given
z, and these are usually found by special methods.


17*03. Watson’s lemma. Two of the most important methods of obtaining asymptotic 
expansions are the method of steepest descents, due to Debye, and that of stationary 
phase, due to Kelvin. They are largely, but not completely, equivalent. We need first 
a form of a lemma due to G. N. Watson.* Consider the integral along the real axis 


=j Z e~ az z m f(z)dz 


( 1 ) 


where/(z) is analytic on the path and not zero at z = 0; Z is independent of a, and may 
be infinite; a is real and positive; and I exists for some a, say a. Hence m> — 1. Then 
within the circle of convergence of the series expansion of f(z) 


$$f(z) = a_0 + a_1 z + \dots + a_{n-1} z^{n-1} + R_n(z), \tag{2}$$


where $R_n(z)/z^n$ tends to a finite limit as $z \to 0$. Take a fixed $A$ in $(0, Z)$ within the circle of convergence; then

$$I = \left[\int_0^A + \int_A^Z\right] e^{-az} z^m f(z)\,dz. \tag{3}$$

In $(0, A)$ the function

$$g(z) = \{f(z) - (a_0 + a_1 z + \dots + a_{n-1} z^{n-1})\}/z^n \tag{4}$$

is bounded. Let the upper bound of its modulus be $M$. Then

$$I_A = \int_0^A e^{-az} z^m f(z)\,dz = \int_0^A e^{-az} z^m (a_0 + a_1 z + \dots + a_{n-1} z^{n-1})\,dz + \int_0^A e^{-az} z^m\,\theta M z^n\,dz, \tag{5}$$

where $|\theta| \le 1$. Now if $z = A(1+u) < Ae^u$,

$$\int_A^\infty e^{-az} z^{m+r}\,dz < \int_0^\infty A^{m+r+1} e^{-Aa - aAu} e^{(m+r)u}\,du = A^{m+r+1} e^{-Aa}/(Aa - m - r), \tag{6}$$

and the first integral on the right of (5) is

$$\sum_{r=0}^{n-1} \frac{(m+r)!\,a_r}{a^{m+r+1}} + O\!\left(\frac{e^{-Aa}}{a}\right). \tag{7}$$

The second has modulus less than $M(m+n)!/a^{m+n+1}$. 


For $a = \alpha$ let the upper bound of $\left|\int_A^X e^{-\alpha z} z^m f(z)\,dz\right|$ for $X$ in $(A, Z)$ be $N$. Then from Abel's lemma for integrals (since $e^{-(a-\alpha)z}$ is a positive decreasing function of $z$)

$$\left|\int_A^Z e^{-az} z^m f(z)\,dz\right| < 2e^{-(a-\alpha)A} N, \tag{8}$$


* Proc. Lond. Math. Soc. (2), 17, 1918, 133; Theory of Bessel Functions, p. 236. 

and therefore

$$\left| I - \sum_{r=0}^{n-1} \frac{(m+r)!\,a_r}{a^{m+r+1}} \right| < \frac{M(m+n)!}{a^{m+n+1}} + K e^{-aA}, \tag{9}$$


where $K$ is independent of $a$. If we multiply by $a^{m+n}$ and make $a \to \infty$ the right side tends to 0. Hence

$$I \sim \sum_{r=0}^{\infty} \frac{(m+r)!\,a_r}{a^{m+r+1}}. \tag{10}$$


There is no reason against the upper limit Z being infinite, since the argument from 
Abel’s lemma still applies so long as the integral converges for some a. It is also permissible 
for f(z) to be unbounded at some point or points so long as the improper integral exists. 

Watson took $f(z)$ bounded on the whole of the real axis and $Z$ infinite. These conditions are often satisfied, but not always, and the slight extension we have made seems worth while. 

The series is ultimately divergent if $f(z)$ has a singularity at a finite distance $R$ from the origin. For if $R' > R$ there exists a $k$ such that $|a_r| > k/R'^{\,r}$ for an infinite number of values of $r$, and the terms in (10), for any given $a$, are unbounded with respect to $r$. 
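The behaviour described by the lemma can be checked numerically. The following is a minimal sketch, assuming the test case $f(z) = 1/(1+z)$, $m = 0$, $a = 10$ (none of which comes from the text); the helper names and the Simpson quadrature are likewise only illustrative. The coefficients are $a_r = (-1)^r$ and the singularity lies at distance $R = 1$, so the error of the partial sums of (10) should first fall and then grow without limit.

```python
import math

def watson_partial_sum(coeffs, m, a, n_terms):
    """Sum of the first n_terms of the series (10): sum_r (m+r)! a_r / a^(m+r+1)."""
    return sum(math.gamma(m + r + 1) * coeffs[r] / a ** (m + r + 1)
               for r in range(n_terms))

def laplace_integral(f, m, a, upper=6.0, steps=60_000):
    """Composite Simpson estimate of int_0^upper exp(-a z) z^m f(z) dz.
    The tail beyond `upper` is negligible for the values used below."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        z = i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * math.exp(-a * z) * z ** m * f(z)
    return total * h / 3

# Assumed test case: f(z) = 1/(1+z), so a_r = (-1)^r.
f = lambda z: 1.0 / (1.0 + z)
coeffs = [(-1) ** r for r in range(40)]
a = 10.0
exact = laplace_integral(f, 0, a)
errors = [abs(watson_partial_sum(coeffs, 0, a, n) - exact) for n in range(1, 31)]
best_n = min(range(len(errors)), key=errors.__getitem__) + 1  # number of terms giving least error
```

With these assumptions the least error occurs when about $a$ terms are kept, and the error with thirty terms is far larger than with one, which is the ultimate divergence referred to above.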

The lemma proves the existence of an asymptotic expansion in Poincaré's sense, and determines the coefficients. It does not provide an estimate of the error for given $a$ in stopping at a given term, since we have not assigned a value to $M$. An idea of the accuracy of the sum up to the smallest term can be got by a method related to what J. R. Airey called 'use of convergence factors'.* The principle is illustrated most easily by our first example. The integral in the remainder term was

$$\int_x^\infty e^{-t} t^{-n-r-1}\,dt,$$

and if $n + r = x$ the next term is numerically equal to the last kept. But if we take the logarithmic derivative of the integrand we get

$$\frac{d}{dt}\{-t - (n+r+1)\log t\} = -1 - \frac{n+r+1}{t},$$

and the two terms are nearly equal when $t = x$. Hence the integral is nearly

$$e^{-x} x^{-n-r-1}\int_0^\infty e^{-2u}\,du = \tfrac12 e^{-x} x^{-n-r-1},$$

and the remainder term is nearly

$$(-)^{r+1}\,\tfrac12\,n(n+1)\dots(n+r)\,e^{-x} x^{-n-r-1},$$

which is half the next term in the expansion. Thus a very substantial improvement will be made if the asymptotic series is computed up to the smallest term, and half the next term is added. Greater accuracy still is obtainable by expanding $\{t + (n+r+1)\log t\}$ to higher powers of $(t-x)/x$. We shall return to this point in Chapter 23. 
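The gain from adding half the next term can be seen in a small numerical experiment. This is a sketch under stated assumptions: it takes $n = 1$ and $x = 10$, writes $e^x\int_x^\infty e^{-t}t^{-1}\,dt = \int_0^\infty e^{-s}/(x+s)\,ds$, and evaluates that integral by Simpson's rule as a reference value; the function names are illustrative only.

```python
import math

def scaled_exp_integral(x, upper=60.0, steps=60_000):
    """Simpson estimate of e^x * int_x^inf e^(-t)/t dt = int_0^inf e^(-s)/(x+s) ds."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * math.exp(-s) / (x + s)
    return total * h / 3

def term(k, x):
    """k-th term of the asymptotic series sum_k (-1)^k k! / x^(k+1)."""
    return (-1) ** k * math.factorial(k) / x ** (k + 1)

x = 10.0
exact = scaled_exp_integral(x)
n_kept = 10                                  # terms k = 0..9; the next term equals the last kept in size
plain = sum(term(k, x) for k in range(n_kept))
improved = plain + 0.5 * term(n_kept, x)     # add half the next term
err_plain = abs(plain - exact)
err_improved = abs(improved - exact)
```

For these values the corrected sum is more than ten times closer to the reference value than the plain sum stopped at the smallest term.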


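* Phil. Mag. (7), 24, 1937, 521-552. 

The case we chiefly need is

$$I = \int_{-A}^{B} e^{-\frac12 b^2 z^2} f(z)\,dz, \tag{11}$$

where $A$, $B$ are positive and independent of $b$, and $f(z)$ still has an expansion given by (2) near $z = 0$. We put $z^2 = \zeta$:

$$I = \tfrac12\int_0^{A^2} e^{-\frac12 b^2\zeta} f(-\zeta^{1/2})\,\zeta^{-1/2}\,d\zeta + \tfrac12\int_0^{B^2} e^{-\frac12 b^2\zeta} f(\zeta^{1/2})\,\zeta^{-1/2}\,d\zeta. \tag{12}$$

The odd powers cancel and

$$I \sim \sqrt{2\pi}\left\{\frac{a_0}{b} + \frac{a_2}{b^3} + 1\cdot3\,\frac{a_4}{b^5} + \dots + 1\cdot3\dots(2n-1)\,\frac{a_{2n}}{b^{2n+1}} + \dots\right\}. \tag{13}$$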


17·04. Method of steepest descents. This is due to Debye, and is applied to the approximate evaluation of integrals of the form

$$I = \int_A^B \chi(z)\,e^{t f(z)}\,dz, \tag{1}$$

where $t$ is large, real and positive, and $f(z)$ is analytic. We write

$$f(z) = \phi + i\psi, \tag{2}$$

separating its real and imaginary parts. $\phi$ and $\psi$ both satisfy Laplace's equation, and the integrand will be large where $\phi$ is algebraically large. The transformation from $x$, $y$ to $\phi$, $\psi$ will be non-singular in a region containing no singularities or zeros of $f'(z)$. In such a region we can pass from $A$ to $B$ by a finite number of steps along lines of $\phi$ or $\psi$ constant. Put $f(z) = \zeta$. Then

$$I = \int_{z=A}^{B} e^{t\zeta}\,\chi(z)\,\frac{dz}{d\zeta}\,d\zeta = \int_{z=A}^{B} e^{t\zeta} g(\zeta)\,d\zeta. \tag{3}$$

Suppose first that $g(\zeta)$ is analytic in the region. Then $g(\zeta)$ has a bounded derivative in the region and its real and imaginary parts separately will be of bounded variation on a finite path of $\phi$ or $\psi$ constant. We can then apply the inequalities derived for integrals in 1·134a. If the path from $A$ to $B$ is one of $\psi$ constant, and $\phi_A > \phi_B$, the path from $A$ to $B$ is called one of steepest descent. Then

$$I = e^{it\psi_A}\int_{z=A}^{B} e^{t\phi} g(\zeta)\,d\phi. \tag{4}$$

If the path is one of $\phi$ constant

$$I = i e^{t\phi_A}\int_{z=A}^{B} e^{it\psi} g(\zeta)\,d\psi, \tag{5}$$

$$|e^{-t\phi_A} I| < 2\sqrt2\,\{|g(B)| + V(B)\}/t, \tag{6}$$

where $V(B)$ is the greater of the total variations of $\Re g(\zeta)$ and $\Im g(\zeta)$ on the path. In either case

$$I = O(e^{t\phi_A}/t) \tag{7}$$

subject to $\phi_A \ge \phi_B$. For infinite paths it is necessary to verify directly whether $V(B)$ is finite. 

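Since $\phi_A - \phi$ is a real variable and $g(\zeta)$ is analytic, we find for (4) by Watson's lemma, with $\zeta_A = f(A)$,

$$I \sim -e^{t f(A)}\sum_{r=0}^{\infty}\frac{(-1)^r g^{(r)}(\zeta_A)}{t^{r+1}}. \tag{8}$$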


If a line of $\phi$ constant connects $A$, $B$ we can in general find a path of constant $\phi$ from $A'$ to $B'$ such that $\phi_{A'} = \phi_{B'} < \phi_A$, and $\psi_{A'} = \psi_A$, $\psi_{B'} = \psi_B$. Then $I$ is equivalent to integrals along $AA'$, $A'B'$, $B'B$, where $\phi$ is constant on $A'B'$ and $\psi$ is constant on $AA'$ and $B'B$. Then by (7) the integral along $A'B'$ is $O(e^{t\phi_{A'}}/t)$. Those along $AA'$ and $B'B$ can be approximated to by (8), and for all $r$, as $t \to \infty$,

$$|t^r e^{t(\phi_{A'} - \phi_A)}| \to 0. \tag{9}$$

Hence $A'B'$ contributes nothing to the asymptotic expansion of $I$ (subject as before to $\Re g(\zeta)$ and $\Im g(\zeta)$ being of bounded variation on $A'B'$). 

It follows that detailed tracing of the paths of constant $\phi$ is seldom necessary; the asymptotic expansion is wholly determined by the behaviour of the integrand near the points where $\phi$ is greatest. 

It often happens, however, that on any path (in the $z$ plane) from $A$ to $B$ there are points where $\phi$ exceeds both $\phi_A$ and $\phi_B$. In this case $\phi$ has a maximum at an interior point of the path. Suppose that the part of the path that passes through this point is one of constant $\psi$ (it cannot be one of constant $\phi$). Then if $ds$, $dn$ are elements of length along and normal to the path we have at this point $\partial\psi/\partial s = 0$, $\partial\phi/\partial s = 0$, and therefore, by the Cauchy-Riemann relations, $\partial\phi/\partial n = 0$, $\partial\psi/\partial n = 0$. The point is therefore one where $f'(z) = 0$, and $g(\zeta)$ as defined in (3) is no longer analytic. Such points are known as saddle-points or cols. Another approach is to consider the maxima of $\phi$ on all paths connecting $A$ and $B$. If the maximum is to be made as small as possible by a suitable choice of path, we must have $\partial\phi/\partial n = 0$ at it; but we also have $\partial\phi/\partial s = 0$ since $\phi$ is a maximum; and therefore

$$f'(z) = 0.$$

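Lines of constant $\psi$ are called lines of steepest descent, because on them the direction at any point is such that $|\partial\phi/\partial s|$ is as great as possible. If $\theta$ is the inclination of the path to the axis of $x$,

$$\frac{\partial\phi}{\partial s} = \cos\theta\,\frac{\partial\phi}{\partial x} + \sin\theta\,\frac{\partial\phi}{\partial y}, \tag{10}$$

and if this is to be stationary for variations of $\theta$

$$0 = -\sin\theta\,\frac{\partial\phi}{\partial x} + \cos\theta\,\frac{\partial\phi}{\partial y} = -\sin\theta\,\frac{\partial\psi}{\partial y} - \cos\theta\,\frac{\partial\psi}{\partial x} = -\frac{\partial\psi}{\partial s}, \tag{11}$$

which is satisfied on a path of constant $\psi$. 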

In these cases, therefore, it is convenient to make part of the path of integration consist of a line of steepest descent through a saddle-point so that the larger values of $\phi$ are concentrated in as short an interval of the path as possible. 

$\phi$ cannot be a maximum for all variations of $x$ and $y$ from a point. Through a saddle-point $z_0$ there will be two or more curves of constant $\phi$, separating the neighbourhood into sectors. Those where $\phi$ is less than at $z_0$ are called valleys, those where it is greater than at $z_0$ hills. If then $A$ and $B$ lie in different valleys specified by a saddle-point $z_0$, the best path will be of the form $ACz_0DB$, where $\phi_C = \phi_A$, $\phi_D = \phi_B$, and $Cz_0D$ is a path of constant $\psi$ through $z_0$. In this case the approximation (8) fails because $g(z_0)$ would be infinite, but this difficulty is overcome by a method described below. The contributions from $AC$ and $DB$ are negligible compared with that from $Cz_0D$, and this is the most striking feature of the method. 

Isolated singularities of $g(\zeta)$, if not actually on the path, do not in general affect the approximation, since all that matters is the upper bound of its modulus or the total variations of its real and imaginary parts on the actual path. If the path from $A$ to $B$ is not one of constant $\phi$ or $\psi$ and is replaced by one consisting of segments of $\phi$ or $\psi$ constant, there may be a singularity $z_1$ of $g(\zeta)$ between the original path and the adopted one, but an integral about $z_1$ will contain a factor $\exp t\phi(z_1)$, which is exponentially small compared with $\exp t\phi(z_0)$. 

Lines of steepest descent terminate only at singular points of f(z) or at infinity. 

If $z_0$ is a saddle-point and $f''(z_0) \ne 0$, $f(z)$ near it can be expanded in the form

$$f(z) = f(z_0) + \tfrac12(z - z_0)^2 f''(z_0) + \dots, \tag{12}$$

and the direction of the path will be such that $(z - z_0)^2 f''(z_0)$ is real and negative. If then we put

$$f(z) - f(z_0) = -\tfrac12\zeta^2, \tag{13}$$

and change the variable to $\zeta$, the integral takes the form considered in Watson's lemma and the existence of an asymptotic expansion in negative powers of $t^{1/2}$ can be inferred. In practice, however, the inversion of series is usually troublesome, and if terms after the first are required they are usually found in some other way. For many purposes, however, the first term is sufficient, and can be obtained easily. We have 

$$I = e^{t f(z_0)}\int e^{-\frac12 t\zeta^2}\chi(z)\,dz = e^{t f(z_0)}\int e^{-\frac12 t\zeta^2}\chi(z)\,\frac{dz}{d\zeta}\,d\zeta. \tag{14}$$

But if we write for values of $z$ on the path, with $r$ real and small,

$$z - z_0 = re^{i\alpha}, \tag{15}$$

$$\zeta^2 = -f''(z_0)\,r^2 e^{2i\alpha}, \qquad \zeta = \pm r\,|f''(z_0)|^{1/2}, \tag{16}$$

since $f''(z_0)e^{2i\alpha}$ is real and negative. Then

$$\frac{dz}{d\zeta} = \pm\frac{e^{i\alpha}}{|f''(z_0)|^{1/2}}. \tag{17}$$

In the range $(-\pi, \pi)$ there are two possible choices for $\alpha$, and they differ by $\pi$. In any application of the method we have at this point to make an inspection of the behaviour of $\phi$ and $\psi$ over the complex plane in order to decide the sense in which the path goes through the saddle-point. If we select the value of $\alpha$ that makes $r$ positive at points on the path after passing through $z_0$, we shall have to take the positive sign in (16), as $\zeta$ goes from $-\infty$ to $+\infty$ on the path. Then by Watson's lemma the integral is given asymptotically by

$$I \sim \frac{\chi(z_0)\,e^{t f(z_0)}\sqrt{2\pi}\,e^{i\alpha}}{|f''(z_0)|^{1/2}\,t^{1/2}}. \tag{18}$$

Since $t^{-n}\exp\{t f(z_0)\}$ will be large for all $n$ if $f(z_0)$ has a positive real part, we should strictly write (18) as

$$I e^{-t f(z_0)} \sim \frac{\chi(z_0)\sqrt{2\pi}\,e^{i\alpha}}{|f''(z_0)|^{1/2}\,t^{1/2}} \tag{19}$$

in order that Poincaré's definition shall be applicable. We shall, however, use the form (18) for convenience, with the understanding that where exponential factors are present in the approximation such a transposition is needed before the definition is applied. 
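The first term of the saddle-point approximation is easy to test numerically. The sketch below assumes an example not taken from the text, $\int_{-\infty}^{\infty} e^{-t\cosh x}\,dx$, for which $f(z) = -\cosh z$, $\chi = 1$, $z_0 = 0$, $f(z_0) = -1$, $f''(z_0) = -1$ and $\alpha = 0$; the function names and the Simpson quadrature are illustrative.

```python
import cmath
import math

def saddle_leading_term(chi0, f0, f2, alpha, t):
    """First term in the form (18):
    chi(z0) e^{t f(z0)} sqrt(2 pi) e^{i alpha} / (|f''(z0)| t)^(1/2)."""
    return (chi0 * cmath.exp(t * f0) * math.sqrt(2 * math.pi)
            * cmath.exp(1j * alpha) / math.sqrt(abs(f2) * t))

def integral_on_real_axis(t, half_width=4.0, steps=40_000):
    """Simpson estimate of int exp(-t cosh x) dx over (-half_width, half_width);
    the integrand is negligible outside this interval for the t used below."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * math.exp(-t * math.cosh(x))
    return total * h / 3

t = 40.0
# (z - z0)^2 f''(z0) is real and negative along the real axis, so alpha = 0.
approx = saddle_leading_term(1.0, -1.0, -1.0, 0.0, t)
numeric = integral_on_real_axis(t)
relative_error = abs(approx - numeric) / numeric
```

Here the real axis is itself the line of steepest descent through the saddle-point, so no deformation of the path is needed, and the relative error is of order $1/t$.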

The method restricts us to paths of steepest descent and traverses on lines of constant $\phi$; and if the termini are such that $\phi$ must have a maximum at an interior point, the path of steepest descent is taken through a saddle-point, thereby making this maximum as small as possible. It may happen that with these restrictions any path deformable into $AB$ will pass through two or more saddle-points. If so, each will make its contribution to the integral. The largest contribution will be from the one where $\phi$ is largest. 

A path of constant $\psi$ through $z_0$ may pass through a second saddle-point $z_1$. Then, if $\phi$ has a maximum at $z_0$, it will have a minimum at $z_1$ and will increase again on the smooth continuation of the path past $z_1$. Hence the line of steepest descent in such a case turns abruptly through a right angle at $z_1$. The contribution from the neighbourhood of $z_1$ will be exponentially small compared with that from $z_0$, but this case is interesting in relation to the Stokes phenomenon, which we shall examine later. 

We have assumed $f''(z_0) \ne 0$. If the first non-vanishing derivative is $f^{(n)}(z_0)$ $(n > 2)$, three or more valleys meet at $z_0$, and it will be necessary to examine which pair of them contains paths leading to the termini. The argument from Watson's lemma needs straightforward modifications. 


17·05. Paths of constant $\phi$; method of stationary phase. If the path $AB$ of 17·04 (1) is one of constant $\phi$, and $\chi/\psi'$ is of bounded variation, we have seen that the integral is $O(e^{t\phi}/t)$, and an approximation can be found by using a path $AA'B'B$. If there is one saddle-point between $A$ and $B$ this argument fails, because if $\partial\psi/\partial s$ changes sign, so does $\partial\phi/\partial n$; then if $\phi_{A'} < \phi_A$, $\phi_{B'} < \phi_B$, $A'$ and $B'$ lie on opposite sides of $AB$, and $\phi$ on $A'B'$ cannot be uniformly less than on $AB$. We can, however, proceed as follows. Suppose that the path of constant $\phi$ is along the axis of $x$ increasing, so that the saddle-point is $x_0$, $\psi'(x_0) = 0$, $\psi''(x_0) \ne 0$. Then on $AB$, near $x_0$,

$$f(z) = f(x_0) + \tfrac12 i\psi''(x_0)(x - x_0)^2 + O(x - x_0)^3, \tag{1}$$

whence, since $f(z)$ is supposed analytic,

$$f(z) = f(x_0) + \tfrac12 i\psi''(x_0)(z - x_0)^2 + O(z - x_0)^3. \tag{2}$$

If $\psi''(x_0) > 0$, $\alpha$ of 17·04 is $+\tfrac14\pi$; if $\psi''(x_0) < 0$, $\alpha = -\tfrac14\pi$. Then the integral along a line of steepest descent through $x_0$

$$= \chi(x_0)\sqrt{\frac{2\pi}{t\,|\psi''(x_0)|}}\exp\{t\phi(x_0) + it\psi(x_0) + \tfrac14\pi i\,\mathrm{sgn}\,\psi''(x_0)\} + O\{t^{-1}\exp t\phi(x_0)\}. \tag{3}$$

If there is a second saddle-point at $x_1$ within the range of integration, $\psi''(x_1)$ will have the opposite sign to $\psi''(x_0)$; since $\phi$ is constant the contributions will be of comparable magnitude. 

If $\psi$ is not given to be the imaginary part of an analytic function, it is still possible to find an approximation to $\int e^{it\psi(x)}\chi(x)\,dx$ by methods similar to those used in proving Fourier's theorem. In any interval where $\chi/\psi'$ is of bounded variation it follows as before that $I = O(1/t)$. If $\psi'(x_0) = 0$, near $x_0$ we put, for $\psi''(x_0) > 0$,

$$\psi(x) - \psi(x_0) = \tfrac12 u^2 \approx \tfrac12\psi''(x_0)(x - x_0)^2. \tag{4}$$

Take $u > 0$ when $x > x_0$. 


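If $\delta$ is fixed and positive, and $I_\delta$ denotes the part of the integral taken over $(x_0 - \delta, x_0 + \delta)$, we have

$$I_\delta = \int_{x_0-\delta}^{x_0+\delta}\exp it\{\psi(x_0) + \tfrac12 u^2\}\,\frac{\chi(x)}{\psi'(x)}\,u\,du, \tag{5}$$

$$\psi'(x) \approx \psi''(x_0)(x - x_0) \approx \{\psi''(x_0)\}^{1/2}u. \tag{6}$$

Then

$$\frac{\chi(x)\,u}{\psi'(x)} = a_0 + \theta u; \qquad a_0 = \frac{\chi(x_0)}{\{\psi''(x_0)\}^{1/2}}, \tag{7}$$

where $\theta u \to 0$ with $u$. If further $\theta$ is of bounded variation in $(-\delta, \delta)$,

$$\int_{x_0-\delta}^{x_0+\delta}\theta u\exp(\tfrac12 itu^2)\,du = O\!\left(\frac1t\right), \tag{8}$$

$$\int_{x_0-\delta}^{x_0+\delta}\exp(\tfrac12 itu^2)\,du \sim \sqrt{\frac{2\pi}{t}}\exp\tfrac14\pi i. \tag{9}$$

Then

$$I = I_\delta + O\!\left(\frac1t\right) \sim \frac{\chi(x_0)}{|\psi''(x_0)|^{1/2}}\sqrt{\frac{2\pi}{t}}\exp\{it\psi(x_0) + \tfrac14\pi i\} + O\!\left(\frac1t\right). \tag{10}$$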

If $\psi''(x_0) < 0$, we must write $-u^2$ for $u^2$ in (4). The effect in (5) is to replace $\psi'(x)$ by $-\psi'(x)$ and $u^2$ by $-u^2$, and in (10) we have $-\tfrac14\pi i$ for $\tfrac14\pi i$. The result therefore agrees with (3) to this order with $\phi = 0$. 

The principle of the method is due to Stokes and Kelvin, who argued that in a wave 
problem the contributions from the parts of the range of integration near a point of 
stationary phase will be nearly in the same phase and add up, whereas those from other 
parts will interfere. It is not so easy, however, to find higher terms by this method. 
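The leading stationary-phase term can be compared with a direct quadrature. This sketch assumes the example $\int_{-1}^{1}e^{itx^2}\,dx$ (so $\psi(x) = x^2$, $\chi = 1$, $x_0 = 0$, $\psi''(x_0) = 2$), which is not from the text; the function names are illustrative.

```python
import cmath
import math

def oscillatory_integral(t, steps=40_000):
    """Simpson estimate of int_{-1}^{1} exp(i t x^2) dx."""
    h = 2.0 / steps
    total = 0j
    for i in range(steps + 1):
        x = -1.0 + i * h
        w = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += w * cmath.exp(1j * t * x * x)
    return total * h / 3

def stationary_phase(chi0, psi0, psi2, t):
    """Leading term chi(x0) sqrt(2 pi / (t |psi''|)) exp{i t psi(x0) + (pi i / 4) sgn psi''}."""
    sign = 1.0 if psi2 > 0 else -1.0
    return (chi0 * math.sqrt(2 * math.pi / (t * abs(psi2)))
            * cmath.exp(1j * (t * psi0 + sign * math.pi / 4)))

t = 200.0
numeric = oscillatory_integral(t)
approx = stationary_phase(1.0, 0.0, 2.0, t)   # psi(x) = x^2, chi = 1, x0 = 0
gap = abs(numeric - approx)
```

The leading term is of order $t^{-1/2}$, while the discrepancy, which comes from the ends of the range, is of order $1/t$, in agreement with the error term stated above.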


17·06. Stirling's formula by steepest descents. The simplest application of Watson's lemma is to the factorial function

$$z! = \int_0^\infty u^z e^{-u}\,du \qquad (\Re(z) > -1). \tag{1}$$

Put $u = zv$. Then ($z^z$ being interpreted as $\exp(z\log z)$ with $|\arg z| < \pi$),

$$z! = z^{z+1}\int_0^\infty \exp\{z(\log v - v)\}\,dv. \tag{2}$$

We shall take

$$z = re^{i\theta} \tag{3}$$

with $\theta$ fixed, and attempt an approximation for given $\theta$ with $r$ large. Then (2) will be written

$$z! = z^{z+1}\int_0^\infty \exp[r\{e^{i\theta}(\log v - v)\}]\,dv \tag{4}$$

so that

$$f(v) = e^{i\theta}(\log v - v), \qquad f'(v) = e^{i\theta}\left(\frac1v - 1\right), \qquad f''(v) = -\frac{e^{i\theta}}{v^2}.$$

Then $\log v - v$ is analytic if we make a cut from 0 to $-\infty$. The only saddle-point is at $v = 1$, where $f(v) = -e^{i\theta}$, $f''(v) = -e^{i\theta}$. 

If $v$ is real $(> 0)$, $\Re f(v) = \cos\theta\,(\log v - v)$, which never exceeds its value at $v = 1$ provided $\cos\theta$ is positive. Hence the real axis lies in two valleys reaching 0 and $\infty$. $\Im f(v)$ is not constant on the real axis except for $\theta = 0$. The direction of the line of steepest descent through $v = 1$ is given by putting $v - 1 = w$, with $-f''(1)w^2 = e^{i\theta}w^2$ real and positive, so that $\arg w = -\tfrac12\theta$ or $\pi - \tfrac12\theta$. The sense to be taken is seen as follows. If we take $\delta > 0$,

$$v_1 = 1 - \delta e^{-\frac12 i\theta}, \qquad v_2 = 1 + \delta e^{-\frac12 i\theta},$$

a path consisting of straight lines from 0 to $v_1$, 1, $v_2$, $\infty$ in turn lies wholly in valleys except at the saddle-point, so that the direction is always from left to right through the saddle-point. Then for large $r$

$$z! \sim z^{z+1}\exp(-re^{i\theta})\sqrt{\frac{2\pi}{r}}\,e^{-\frac12 i\theta} \tag{5}$$

$$= \sqrt{2\pi}\,z^{z+\frac12}e^{-z}, \qquad \cos\theta > 0, \tag{6}$$

where the value of $z^{1/2}$ with a positive real part is to be taken. This is the first term of Stirling's expansion. Note that if $\cos\theta > 0$, $\Re(z) > -1$ is satisfied, so that the latter condition becomes superfluous. 
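The quality of the first term can be checked for real $z$. The sketch below is an illustration only: it uses the standard-library `math.lgamma` with $z! = \Gamma(z+1)$, and it compares the ratio with $1 + 1/(12z)$, the well-known next term of Stirling's series, which is not derived in this section.

```python
import math

def stirling_first_term_log(z):
    """log of sqrt(2 pi) z^(z + 1/2) e^(-z) for real z > 0."""
    return 0.5 * math.log(2 * math.pi) + (z + 0.5) * math.log(z) - z

def ratio_to_factorial(z):
    """z! divided by the first term of Stirling's expansion, with z! = Gamma(z + 1)."""
    return math.exp(math.lgamma(z + 1.0) - stirling_first_term_log(z))

ratios = {z: ratio_to_factorial(z) for z in (5.0, 50.0, 500.0)}
```

The ratios exceed unity and approach it as $z$ increases, the excess being close to $1/(12z)$.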
17·07. The Airy integral. Consider the integrals

$$I(z) = \frac{1}{2\pi i}\int e^{tz - \frac13 t^3}\,dt \tag{1}$$

taken along any of the three paths shown in the figure. They converge exponentially provided that the real part of $t^3$ tends to $+\infty$ at the termini; thus the three termini can in the first place be conveniently written $+\infty$, $\infty\exp(\tfrac23\pi i)$, $\infty\exp(-\tfrac23\pi i)$. We have

$$\frac{d^2I}{dz^2} - zI = \frac{1}{2\pi i}\int e^{tz - \frac13 t^3}(t^2 - z)\,dt = -\frac{1}{2\pi i}\left[e^{tz - \frac13 t^3}\right] = 0, \tag{2}$$

since $\exp(tz - \tfrac13 t^3)$ tends to 0 at both limits in each case. Hence these three integrals are solutions of the differential equation

$$\frac{d^2Z}{dz^2} - zZ = 0. \tag{3}$$

Since this equation is of the second order it can have only two linearly independent solutions, and there must be a linear relation between the integrals. But if we take the integral around any contour in the positive sense around the origin it will vanish since there is no singularity of the integrand at a finite distance. Hence, with the senses indicated in the diagram, the sum of the three integrals is 0. 

First take the path $L_1$ and define

$$\mathrm{Ai}(z) = \frac{1}{2\pi i}\int_{L_1} e^{tz - \frac13 t^3}\,dt. \tag{4}$$

This is the Airy integral, one form of which was studied first by Airy in relation to diffraction near a caustic surface. It may be proved (cf. 1·123) that it still converges if $z$ is real and $L_1$ is reduced to the imaginary axis. If we put $t = is$ it reduces to

$$\mathrm{Ai}(z) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i(sz + \frac13 s^3)}\,ds = \frac1\pi\int_0^\infty \cos(sz + \tfrac13 s^3)\,ds, \tag{5}$$

which, apart from some constant factors, is the form used by Airy. 

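Alternatively, take the integrals from 0 to $\infty$, $\infty\exp\tfrac23\pi i$, $\infty\exp(-\tfrac23\pi i)$ and denote them respectively by $I_1$, $I_2$, $I_3$. Then

$$I_2(z) = \frac{1}{2\pi i}\int_0^{\infty\exp\frac23\pi i} e^{tz - \frac13 t^3}\,dt = \frac{1}{2\pi i}\exp(\tfrac23\pi i)\int_0^\infty \exp(uze^{\frac23\pi i} - \tfrac13 u^3)\,du = \exp(\tfrac23\pi i)\,I_1(z\exp\tfrac23\pi i), \tag{6}$$

$$I_3(z) = \exp(-\tfrac23\pi i)\,I_1\{z\exp(-\tfrac23\pi i)\}, \tag{7}$$

and

$$\mathrm{Ai}(z) = I_2(z) - I_3(z). \tag{8}$$

Also

$$\frac{1}{2\pi i}\int_{L_2} e^{tz - \frac13 t^3}\,dt = -I_1(z) + I_3(z), \qquad \frac{1}{2\pi i}\int_{L_3} e^{tz - \frac13 t^3}\,dt = -I_2(z) + I_1(z). \tag{9}$$

The three solutions are thus expressed in terms of the single function $I_1(z)$. Now

$$I_1(z) = \frac{1}{2\pi i}\int_0^\infty e^{tz - \frac13 t^3}\,dt = \frac{1}{2\pi i}\int_0^\infty e^{-\frac13 t^3}\sum_{r=0}^\infty \frac{(tz)^r}{r!}\,dt = \frac{1}{2\pi i}\int_0^\infty e^{-v}\,3^{-2/3}v^{-2/3}\sum_{r=0}^\infty \frac{(3^{1/3}zv^{1/3})^r}{r!}\,dv. \tag{10}$$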
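Hence

$$\begin{aligned}
\mathrm{Ai}(z) &= \frac{1}{2\pi i}\exp(\tfrac23\pi i)\,3^{-2/3}\sum_{r=0}^\infty \frac{(3^{1/3}z)^r}{r!}(\tfrac13 r - \tfrac23)!\,\exp\tfrac23\pi ir \\
&\quad - \frac{1}{2\pi i}\exp(-\tfrac23\pi i)\,3^{-2/3}\sum_{r=0}^\infty \frac{(3^{1/3}z)^r}{r!}(\tfrac13 r - \tfrac23)!\,\exp(-\tfrac23\pi ir) \\
&= \frac1\pi\,3^{-2/3}\sum_{r=0}^\infty \frac{(3^{1/3}z)^r}{r!}(\tfrac13 r - \tfrac23)!\,\sin\{\tfrac23\pi(r+1)\} \\
&= \frac1\pi\,3^{-2/3}\sin\tfrac23\pi\,(-\tfrac23)!\left(1 + \frac{z^3}{2\cdot3} + \frac{z^6}{2\cdot3\cdot5\cdot6} + \frac{z^9}{2\cdot3\cdot5\cdot6\cdot8\cdot9} + \dots\right) \\
&\quad + \frac1\pi\,3^{-1/3}\sin\tfrac43\pi\,(-\tfrac13)!\left(z + \frac{z^4}{3\cdot4} + \frac{z^7}{3\cdot4\cdot6\cdot7} + \frac{z^{10}}{3\cdot4\cdot6\cdot7\cdot9\cdot10} + \dots\right).
\end{aligned} \tag{11}$$

This is the sum of two convergent series for all $z$ and is real for real $z$. Each separately satisfies the differential equation (3). Denote the numerical coefficients outside the brackets by $\alpha$ and $-\beta$ and write

$$\mathrm{Ai}(z) = \alpha y_1(z) - \beta y_2(z). \tag{12}$$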


Using the relation

$$z!\,(-z)! = \frac{\pi z}{\sin\pi z}$$

we find $\alpha = 3^{-2/3}/(-\tfrac13)!$, $\beta = 3^{-1/3}/(-\tfrac23)!$. 
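The series (11) with these values of $\alpha$ and $\beta$ can be summed directly for moderate $|z|$. This is a minimal sketch: the function names are illustrative, the factorials are evaluated with the standard-library `math.gamma` through $(-\tfrac13)! = \Gamma(\tfrac23)$ and $(-\tfrac23)! = \Gamma(\tfrac13)$, and the number of terms is an arbitrary choice that is ample for the small arguments used here.

```python
import math

def airy_series_pair(z, terms=60):
    """Partial sums of y1 = 1 + z^3/(2.3) + z^6/(2.3.5.6) + ... and
    y2 = z + z^4/(3.4) + z^7/(3.4.6.7) + ..., as in (11)."""
    y1, y2 = 0.0, 0.0
    t1, t2 = 1.0, z
    for k in range(terms):
        y1 += t1
        y2 += t2
        t1 *= z ** 3 / ((3 * k + 2) * (3 * k + 3))
        t2 *= z ** 3 / ((3 * k + 3) * (3 * k + 4))
    return y1, y2

ALPHA = 3 ** (-2 / 3) / math.gamma(2 / 3)   # (-1/3)! = Gamma(2/3)
BETA = 3 ** (-1 / 3) / math.gamma(1 / 3)    # (-2/3)! = Gamma(1/3)

def ai(z):
    """Ai(z) = alpha y1(z) - beta y2(z), equation (12)."""
    y1, y2 = airy_series_pair(z)
    return ALPHA * y1 - BETA * y2

# Residual of the differential equation (3) at z = 1.5, by a central second difference.
z0, h = 1.5, 1e-3
residual = (ai(z0 + h) - 2 * ai(z0) + ai(z0 - h)) / h ** 2 - z0 * ai(z0)
```

The computed values agree with tabulated values of the Airy function, for example $\mathrm{Ai}(0) \approx 0.3550281$ and $\mathrm{Ai}(1) \approx 0.1352924$, and the residual of (3) is small.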

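For another solution of (3), real for real $z$, take

$$\begin{aligned}
\mathrm{Bi}(z) &= i\{2I_1(z) - I_2(z) - I_3(z)\} \\
&= \frac{1}{2\pi}\,3^{-2/3}\sum_{r=0}^\infty \frac{(3^{1/3}z)^r}{r!}(\tfrac13 r - \tfrac23)!\,[2 - \exp\tfrac23(r+1)\pi i - \exp\{-\tfrac23(r+1)\pi i\}] \\
&= \frac1\pi\,3^{-2/3}\sum_{r=0}^\infty \frac{(3^{1/3}z)^r}{r!}(\tfrac13 r - \tfrac23)!\,\{1 - \cos\tfrac23(r+1)\pi\} \\
&= \sqrt3\,(\alpha y_1 + \beta y_2).
\end{aligned} \tag{13}$$

Reducing $I_2$ and $I_3$ to integrals along the imaginary axis we have also for real $z$

$$\mathrm{Bi}(z) = \frac1\pi\int_0^\infty \{e^{tz - \frac13 t^3} + \sin(tz + \tfrac13 t^3)\}\,dt. \tag{14}$$

The series expansions converge too slowly to be convenient for computation if $|z|$ is more than about 3. We try the method of steepest descents. 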

We therefore study the behaviour of the paths of steepest descent through the saddle-points when $z$ has a general complex value. We write $z = r\exp(i\theta)$, $tz - \tfrac13 t^3 = f(t)$. Then the saddle-points are $P_1$ ($t = t_1 = r^{1/2}\exp(\tfrac12 i\theta)$) and $P_2$ ($t = t_2 = -r^{1/2}\exp(\tfrac12 i\theta)$). $t_1$ has a positive and $t_2$ a negative real part if $-\pi < \theta < \pi$. We consider $\theta \ge 0$. We have

$$\Im f(t_2) = -\Im f(t_1) = -\tfrac23 r^{3/2}\sin\tfrac32\theta.$$

The path of steepest descent $S_1$ through $P_1$ makes an angle $\pi - \tfrac14\theta$ and the path of steepest descent $S_2$ through $P_2$ makes an angle $\tfrac12\pi - \tfrac14\theta$ with the $x$ axis. 

We consider first the approximation to $\mathrm{Ai}(z)$, for which the path is $L_1$. There are the following cases: 

(i) $0 < \theta < \tfrac13\pi$. $S_2$ goes from $\infty\exp(-\tfrac23\pi i)$ through $P_2$ to $\infty\exp(\tfrac23\pi i)$, keeping $P_1$ always on the right; it has the form of $L_1$ and since $\Re f(t_2) < 0$, we have that $\mathrm{Ai}(z)$ is exponentially decreasing. 

(ii) $\theta = \tfrac13\pi$. The path $S_2$ still has the form of $L_1$ but $\Re f(t_2) = 0$ and $\mathrm{Ai}(z)$ becomes oscillatory. 

(iii) $\tfrac13\pi < \theta < \tfrac23\pi$. The path $S_2$ goes from $\infty\exp(-\tfrac23\pi i)$ through $P_2$, cuts the asymptote going to $\infty\exp(\tfrac23\pi i)$ and approaches it from above. It still keeps $P_1$ always on the right and is of the form of $L_1$. $\Re f(t_2)$ is positive and $\mathrm{Ai}(z)$ is exponentially increasing. 

(iv) $\theta = \tfrac23\pi$. The path $S_2$ is a straight line from $\infty\exp(-\tfrac23\pi i)$ through $P_2$ and $P_1$ to $\infty\exp(\tfrac13\pi i)$. A path $L_1$ follows this as far as $P_1$ and then turns through a right angle along $S_1$ and goes to $\infty\exp(\tfrac23\pi i)$. $\Re f(t_1)$ is negative and the contribution from the part of the path near $P_1$ is small compared with that from the neighbourhood of $P_2$. $\mathrm{Ai}(z)$ is exponentially increasing. 

(v) $\tfrac23\pi < \theta < \pi$. The path $S_2$ now goes from $\infty\exp(-\tfrac23\pi i)$ to $\infty$ and to complete a path equivalent to $L_1$ we have to add a path $S_1$ from $\infty$ through $P_1$ to $\infty\exp(\tfrac23\pi i)$. The contribution from the neighbourhood of $P_1$ is exponentially decreasing while that from the neighbourhood of $P_2$ is exponentially increasing, and the latter determines the behaviour of $\mathrm{Ai}(z)$. 

(vi) $\theta = \pi$. The path $L_1$ is made up of the same two parts as in (v), but now $\Re f(t_1) = \Re f(t_2) = 0$ and the contributions from the two saddle-points are of comparable magnitude. $\mathrm{Ai}(z)$ is oscillatory. 

The first few terms of the asymptotic expansions obtained in this way for $-\pi < \arg z < \pi$ and $\arg z = \pi$ are given in (20) and (22). 

The treatment of $\mathrm{Bi}(z)$ is a little more complicated because two paths, $L_2$ and $L_3$, are involved. We have, however, putting $t = \tau\exp(-\tfrac23\pi i)$,

$$\frac{1}{2\pi i}\int_{L_3} e^{tz - \frac13 t^3}\,dt = \frac{\exp(-\tfrac23\pi i)}{2\pi i}\int_{L_1} e^{\tau z\exp(-\frac23\pi i) - \frac13\tau^3}\,d\tau = \exp(-\tfrac23\pi i)\,\mathrm{Ai}(z\exp(-\tfrac23\pi i)). \tag{15}$$

Similarly

$$\frac{1}{2\pi i}\int_{L_2} e^{tz - \frac13 t^3}\,dt = \exp(\tfrac23\pi i)\,\mathrm{Ai}(z\exp(\tfrac23\pi i)). \tag{16}$$

Then (20) will yield a valid approximation for both (15) and (16) in the range

$$-\tfrac13\pi < \arg z < \tfrac13\pi.$$

In this range, therefore, we have the expansion given in (21). 

If this were valid outside the given range it would suggest that $\mathrm{Bi}(z)$ is exponentially small for some values of $\arg z$. This is not so; in fact $\mathrm{Bi}(z)$ is exponentially large for all values of $\arg z$, except for some that make it oscillatory. To see this and to obtain the asymptotic expansion for other values of $\arg z$ we make use of the identities

$$\mathrm{Ai}(ze^{\frac23 k\pi i}) = e^{\frac13 k\pi i}\left\{\cos\tfrac13 k\pi\,\mathrm{Ai}(z) - \frac{i}{\sqrt3}\sin\tfrac13 k\pi\,\mathrm{Bi}(z)\right\}, \tag{17}$$

$$\mathrm{Bi}(ze^{\frac23 k\pi i}) = e^{\frac13 k\pi i}\left\{-\sqrt3\,i\sin\tfrac13 k\pi\,\mathrm{Ai}(z) + \cos\tfrac13 k\pi\,\mathrm{Bi}(z)\right\}, \tag{18}$$

$$\frac{i}{\sqrt3}\sin\tfrac13 k\pi\,\mathrm{Bi}(z) = e^{\frac13 k\pi i}\,\mathrm{Ai}(ze^{-\frac23 k\pi i}) - \cos\tfrac13 k\pi\,\mathrm{Ai}(z), \tag{19}$$

where $k$ is a positive or negative integer. 

From the last of these with $k = \pm1$ we have that, if $|\arg z| = \tfrac13\pi$ or $\pi$, $\mathrm{Bi}(z)$ is oscillatory, and that otherwise one of the terms on the right is exponentially increasing. In particular if $\arg z = \pi$ and $z = -\zeta$ we have the expansions (22), (23). 
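The rotation identities can be verified numerically from the convergent series, which hold for complex $z$. The sketch below assumes the forms $\mathrm{Ai} = \alpha y_1 - \beta y_2$ and $\mathrm{Bi} = \sqrt3(\alpha y_1 + \beta y_2)$ of (12) and (13), and reads the coefficient of $\sin\tfrac13 k\pi\,\mathrm{Bi}(z)$ in (17) as $i/\sqrt3$; the test point and function names are arbitrary illustrative choices.

```python
import cmath
import math

ALPHA = 3 ** (-2 / 3) / math.gamma(2 / 3)
BETA = 3 ** (-1 / 3) / math.gamma(1 / 3)

def y_pair(z, terms=80):
    """Partial sums of the two power series y1, y2 (valid for complex z)."""
    y1, y2 = 0.0, 0.0
    t1, t2 = 1.0, z
    for k in range(terms):
        y1 += t1
        y2 += t2
        t1 *= z ** 3 / ((3 * k + 2) * (3 * k + 3))
        t2 *= z ** 3 / ((3 * k + 3) * (3 * k + 4))
    return y1, y2

def ai(z):
    y1, y2 = y_pair(z)
    return ALPHA * y1 - BETA * y2

def bi(z):
    y1, y2 = y_pair(z)
    return math.sqrt(3) * (ALPHA * y1 + BETA * y2)

def ai_rotated(z, k):
    """Right-hand side of (17)."""
    a = k * math.pi / 3
    return cmath.exp(1j * a) * (math.cos(a) * ai(z) - 1j / math.sqrt(3) * math.sin(a) * bi(z))

def bi_rotated(z, k):
    """Right-hand side of (18)."""
    a = k * math.pi / 3
    return cmath.exp(1j * a) * (-math.sqrt(3) * 1j * math.sin(a) * ai(z) + math.cos(a) * bi(z))

z = 1.2 + 0.7j
ks = (-2, -1, 1, 2)
ai_mismatch = max(abs(ai(z * cmath.exp(2j * k * math.pi / 3)) - ai_rotated(z, k)) for k in ks)
bi_mismatch = max(abs(bi(z * cmath.exp(2j * k * math.pi / 3)) - bi_rotated(z, k)) for k in ks)
```

Both mismatches are at rounding level, which supports the signs and factors as written in (17) and (18).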

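We now summarize the results obtained:

$$\mathrm{Ai}(z) \sim \tfrac12\pi^{-1/2}z^{-1/4}\exp(-\tfrac23 z^{3/2})\left\{1 - \frac{1\cdot5}{1!\,48}z^{-3/2} + \frac{1\cdot7\cdot5\cdot11}{2!\,48^2}z^{-3} - \frac{1\cdot7\cdot13\cdot5\cdot11\cdot17}{3!\,48^3}z^{-9/2} + \dots\right\} \tag{20}$$

for $-\pi < \arg z < \pi$;

$$\mathrm{Bi}(z) \sim \pi^{-1/2}z^{-1/4}\exp(\tfrac23 z^{3/2})\left\{1 + \frac{1\cdot5}{1!\,48}z^{-3/2} + \frac{1\cdot7\cdot5\cdot11}{2!\,48^2}z^{-3} + \frac{1\cdot7\cdot13\cdot5\cdot11\cdot17}{3!\,48^3}z^{-9/2} + \dots\right\} \tag{21}$$

for $-\tfrac13\pi < \arg z < \tfrac13\pi$. 

When $\arg z = \pi$ we put $z = -\zeta$ and obtain

$$\mathrm{Ai}(z) \sim \pi^{-1/2}\zeta^{-1/4}\{P(\zeta)\sin(\tfrac23\zeta^{3/2} + \tfrac14\pi) - Q(\zeta)\cos(\tfrac23\zeta^{3/2} + \tfrac14\pi)\}, \tag{22}$$

$$\mathrm{Bi}(z) \sim \pi^{-1/2}\zeta^{-1/4}\{P(\zeta)\cos(\tfrac23\zeta^{3/2} + \tfrac14\pi) + Q(\zeta)\sin(\tfrac23\zeta^{3/2} + \tfrac14\pi)\}, \tag{23}$$

where

$$P(\zeta) \sim 1 - \frac{1\cdot7\cdot5\cdot11}{2!\,48^2}\zeta^{-3} + \frac{1\cdot7\cdot13\cdot19\cdot5\cdot11\cdot17\cdot23}{4!\,48^4}\zeta^{-6} - \dots,$$

$$Q(\zeta) \sim \frac{1\cdot5}{1!\,48}\zeta^{-3/2} - \frac{1\cdot7\cdot13\cdot5\cdot11\cdot17}{3!\,48^3}\zeta^{-9/2} + \dots.$$

The second terms are about a tenth of the first even at $|z| = 1$. 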
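The usefulness of the expansion for $\mathrm{Ai}(z)$ at moderate arguments can be illustrated by comparing its first four terms with the convergent power series. This sketch assumes real $z = 4$ and the series form $\mathrm{Ai} = \alpha y_1 - \beta y_2$; the function names are illustrative.

```python
import math

ALPHA = 3 ** (-2 / 3) / math.gamma(2 / 3)
BETA = 3 ** (-1 / 3) / math.gamma(1 / 3)

def ai_series(z, terms=80):
    """Ai(z) from the convergent power series alpha*y1 - beta*y2."""
    y1, y2 = 0.0, 0.0
    t1, t2 = 1.0, z
    for k in range(terms):
        y1 += t1
        y2 += t2
        t1 *= z ** 3 / ((3 * k + 2) * (3 * k + 3))
        t2 *= z ** 3 / ((3 * k + 3) * (3 * k + 4))
    return ALPHA * y1 - BETA * y2

def ai_asymptotic(z):
    """First four terms of the expansion (20), for real z > 0."""
    x = z ** -1.5
    bracket = (1
               - (1 * 5) / 48 * x
               + (1 * 7 * 5 * 11) / (2 * 48 ** 2) * x ** 2
               - (1 * 7 * 13 * 5 * 11 * 17) / (6 * 48 ** 3) * x ** 3)
    return 0.5 * math.pi ** -0.5 * z ** -0.25 * math.exp(-2 / 3 * z ** 1.5) * bracket

z = 4.0
relative_gap = abs(ai_asymptotic(z) - ai_series(z)) / ai_series(z)
second_term_at_unity = (1 * 5) / 48      # size of the second term of (20) at |z| = 1
```

At $z = 4$ four terms already agree with the series to about one part in ten thousand, and the second term at $|z| = 1$ is $5/48$, about a tenth.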

The particular functions $\mathrm{Ai}(z)$ and $\mathrm{Bi}(z)$ were chosen as the fundamental pair so that one would decrease exponentially along the positive real axis and so that on the negative real axis they would have similar amplitudes for large $|z|$ but differ by $\tfrac12\pi$ in phase. 

A linear combination of the series on the right in (20) and (21) is an asymptotic approximation to some solution of the equation, but not to the same solution for all values of $\arg z$. This phenomenon, discovered by Stokes, is known as the discontinuity of arbitrary constants in asymptotic approximations. It is an example of the theorem of 17·023 that an asymptotic expansion, if not convergent, cannot be valid for all values of $\arg z$. 

17·08. Dispersion: wave-velocity and group-velocity. In a continuous dynamical system capable of propagating waves along the $x$ axis, let wave-length $2\pi/\kappa$ be associated with period $2\pi/\gamma$. If the original disturbance $\zeta_0$ is unity for $-h < x < h$ and otherwise zero, we can write it as $H(x + h) - H(x - h)$, or in Fourier form

$$\zeta_0 = \frac2\pi\int_0^\infty \sin\kappa h\,\cos\kappa x\,\frac{d\kappa}{\kappa}. \tag{1}$$

The original rate of change can also be expressed as a Fourier integral; let us take it, however, to be zero. We then have the problem of waves spreading out from an initially disturbed region. For instance, the system may be a long canal and the original disturbance an elevation or depression of the water surface by some solid striking it. The elevation for a later time will then be 

$$\zeta = \frac2\pi\int_0^\infty \sin\kappa h\,\cos\kappa x\,\cos\gamma t\,\frac{d\kappa}{\kappa}. \tag{2}$$

We neglect possible complications due to reflexion at the ends, if any; that is, we treat the problem as determined entirely by the initial disturbance. The integrand in (2) can be broken up as follows:

$$\zeta = \frac{1}{2\pi}\int_0^\infty \{\sin(\gamma t - \kappa x + \kappa h) - \sin(\gamma t - \kappa x - \kappa h) + \sin(\gamma t + \kappa x + \kappa h) - \sin(\gamma t + \kappa x - \kappa h)\}\frac{d\kappa}{\kappa}. \tag{3}$$

If $\gamma$ was proportional to $\kappa$ we should have a system that propagates waves of all lengths with the same velocity, and then the first and second terms would represent waves travelling towards $x$ positive, the third and fourth towards $x$ negative. Those represented by the first and fourth terms would appear to have started from $x = h$, the others from $x = -h$. We are concerned here with cases where $\gamma$ is not proportional to $\kappa$; in other words the wave-velocity depends on the wave-length. We take $\gamma$ to be an odd function of $\kappa$, real for $\kappa$ real. The treatments of the four terms are all similar, and we may confine attention to the first. We then take 

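$$\zeta_1 = \frac{1}{4\pi i}\int_0^\infty \{e^{i(\gamma t - \kappa x + \kappa h)} - e^{-i(\gamma t - \kappa x + \kappa h)}\}\frac{d\kappa}{\kappa} = \frac{1}{4\pi i}\int_{-\infty}^{\infty} e^{i(\gamma t - \kappa x + \kappa h)}\frac{d\kappa}{\kappa}. \tag{4}$$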


Here $h$ appears only in the combination $x - h$; we may therefore omit it, as if the waves started at $x = 0$, and restore it later if required. We can also temporarily omit the suffix in $\zeta_1$. In the applications $x$ and $t$ are both large; but we can find an approximation for large $t$ with $x/t$ fixed. Then

$$f(\kappa) = i(\gamma - \kappa x/t) \tag{5}$$

and at a saddle-point

$$f'(\kappa) = i(\gamma' - x/t) = 0; \qquad f''(\kappa) = i\gamma'', \tag{6}$$

accents denoting differentiation with regard to $\kappa$. It thus appears that a fundamental part in the method is played by $d\gamma/d\kappa$; this is called the group-velocity. The wave-velocity, at which all waves would travel if only one wave-length was present, is $\gamma/\kappa$; but we see that the Fourier representation of a local disturbance automatically introduces all possible wave-lengths, and it remains to be seen whether the wave-velocity reappears explicitly. We shall find that it does. The relation between them can be written, if we put $\gamma/\kappa = c$, $d\gamma/d\kappa = C$,

$$C = \frac{d}{d\kappa}(\kappa c) = c + \kappa\frac{dc}{d\kappa} = c - \lambda\frac{dc}{d\lambda}, \tag{7}$$

if we introduce the wave-length $\lambda = 2\pi/\kappa$ instead of $\kappa$; but the results are far more easily stated in terms of $\kappa$. (6) is an equation determining $\kappa$ as a function of $x/t$; denote this value by $\kappa_0$. We assume at present that it is real. Since $\gamma'$ is an even function of $\kappa$, if $\kappa_0$ is one (positive) root $-\kappa_0$ is another. We also assume that $\gamma$ is not infinite for any real $\kappa$, so that the integrand in (4) has no singularities on the real axis except the pole at $\kappa = 0$, which is irrelevant because the other three parts of (3) will also have poles at 0 with residues $\pm1$, and their effects will all cancel when they are combined, whatever path of integration we choose. 
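The relations between $c$, $C$ and $\kappa_0$ can be made concrete with a specific dispersion law. The sketch below assumes $\gamma = \sqrt{g\kappa}$ (gravity waves on deep water), which is an illustrative choice not made in this section; the value of $g$ and the function names are likewise assumptions, and the derivatives are taken by central differences.

```python
import math

G = 9.81  # m/s^2; assumed value for the deep-water illustration

def gamma(kappa):
    """Assumed dispersion relation gamma = sqrt(g kappa)."""
    return math.sqrt(G * kappa)

def wave_velocity(kappa):
    """c = gamma / kappa."""
    return gamma(kappa) / kappa

def group_velocity(kappa, h=1e-6):
    """C = d gamma / d kappa by a central difference."""
    return (gamma(kappa + h) - gamma(kappa - h)) / (2 * h)

def group_velocity_from_wavelength(lam, h=1e-6):
    """C = c - lambda dc/dlambda, the last form in (7)."""
    c = lambda l: wave_velocity(2 * math.pi / l)
    return c(lam) - lam * (c(lam + h) - c(lam - h)) / (2 * h)

def kappa0(x_over_t):
    """Root of gamma'(kappa) = x/t for this relation: kappa0 = g / (4 (x/t)^2)."""
    return G / (4 * x_over_t ** 2)

kappa = 0.5
c_value = wave_velocity(kappa)
C_value = group_velocity(kappa)
```

For this law the group-velocity is half the wave-velocity, both forms in (7) agree, and the wave-number found at a point moving with $x/t$ fixed is the one whose group-velocity equals $x/t$.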

Then $f''(\kappa_0)$ is purely imaginary, and $\gamma''$ will have opposite signs at $\kappa_0$ and $-\kappa_0$. If $\gamma''$ is positive at $\kappa_0$, the path of steepest descent at $\kappa_0$ will cross the real axis in the direction $\tfrac14\pi$, that at $-\kappa_0$ in the direction $-\tfrac14\pi$. If $\gamma''$ is negative these relations will be reversed. Then the first term in the asymptotic expansion of the contribution from the passage through $\kappa_0$, in the first case, is

$$\frac{1}{4\pi i}\,\frac{\sqrt{2\pi}\,\exp i(\gamma_0 t - \kappa_0 x)}{\kappa_0\{t\gamma''(\kappa_0)\}^{1/2}}\,e^{\frac14\pi i}, \tag{8}$$

and the passage through $-\kappa_0$ gives

$$-\frac{1}{4\pi i}\,\frac{\sqrt{2\pi}\,\exp\{-i(\gamma_0 t - \kappa_0 x)\}}{\kappa_0\{t\gamma''(\kappa_0)\}^{1/2}}\,e^{-\frac14\pi i}, \tag{9}$$

on attending to the reversals of sign but remembering that it is $|\gamma''(-\kappa_0)|$ that appears in the denominator, the argument being looked after by the last factor. Combining the two we have

$$\zeta_1 \sim \frac{1}{\kappa_0\sqrt{2\pi t\gamma_0''}}\sin(\gamma_0 t - \kappa_0 x + \tfrac14\pi). \tag{10}$$

If $\gamma_0''$ is negative we still take $|\gamma_0''|$ in the denominator but reverse the signs of the exponents $\pm\tfrac14\pi i$. Then in both cases

$$\zeta_1 \sim \frac{1}{\kappa_0\sqrt{2\pi t\,|\gamma_0''|}}\sin(\gamma_0 t - \kappa_0 x + \tfrac14\pi\,\mathrm{sgn}\,\gamma''(\kappa_0)). \tag{11}$$

There is therefore a difference of phase of $\tfrac12\pi$ according as the group-velocity increases or decreases with $\kappa$. 

Now consider the angle

$$\theta = \gamma_0 t - \kappa_0 x, \tag{12}$$

and see how it varies with $x$ and $t$, remembering that $\kappa_0$ and therefore $\gamma_0$ are functions of $x/t$ determined by (6).

$$\frac{\partial\theta}{\partial x} = t\,\frac{d\gamma_0}{d\kappa_0}\frac{\partial\kappa_0}{\partial x} - \kappa_0 - x\frac{\partial\kappa_0}{\partial x}. \tag{13}$$

But $t\,d\gamma_0/d\kappa_0 - x = 0$ by (6); hence the terms in $\partial\kappa_0/\partial x$ cancel, and $\partial\theta/\partial x = -\kappa_0$. Hence the phase varies by $2\pi$ in a distance $2\pi/\kappa_0$; and therefore $2\pi/\kappa_0$ is the wave-length of waves passing a point at distance $x$ near time $t$. Similarly

$$\frac{\partial\theta}{\partial t} = \gamma_0 + t\,\frac{d\gamma_0}{d\kappa_0}\frac{\partial\kappa_0}{\partial t} - x\frac{\partial\kappa_0}{\partial t} = \gamma_0. \tag{14}$$

Hence $2\pi/\gamma_0$ is the period of waves passing a point at distance $x$ near time $t$. Then $\gamma_0/\kappa_0$ is the velocity of waves passing a point at distance $x$ at time $t$, in the sense that if we move forward with this velocity we shall keep in the same position relative to the nearest crest and trough. Thus the velocity of an individual wave is the wave-velocity appropriate to its length. But we cannot infer from this that each wave travels with a constant velocity. For if $\gamma$ is not proportional to $\kappa$, $C$ is not equal to $\gamma/\kappa$; therefore if we take a point travelling outwards with velocity $c$ it will come to a place where the ratio $x/t$ is different from what it was at the place and time when we started, $C$ will be different and therefore the local values of $\gamma$, $\kappa$, and $c$ will also be different. We can think of an individual wave, and it travels with velocity $c$ appropriate to its length at the moment; but as the wave goes on its period, length, and velocity will all change. If on the other hand we travel out so as to keep $x/t$ the same, we always keep to the same value of $C$, and therefore of $\kappa$, $\gamma$ and $c$; but we do not keep to the same wave because $c \ne C$. That is, periods, wave-lengths and wave-velocities are propagated with the group-velocity, individual waves travel with the local wave-velocity, but change their periods, lengths, and velocities as they travel. 

Energy also is propagated with the group-velocity, in a certain sense. Let us consider
two points starting at $x = 0$ at time 0 and moving with velocities $C_1$, $C_2$. The energy
corresponding to the displacements between these points at time $t$ can be taken to be
proportional to
\[ \int_{C_1t}^{C_2t} \frac{\sin^2(\gamma_0 t - \kappa_0 x \pm \tfrac14\pi)}{t\,|\gamma_0''|\,\kappa_0^2}\,dx, \tag{15} \]
provided that there are several waves between $x = C_1t$ and $C_2t$. But then we can take a
mean value of $\sin^2(\gamma_0 t - \kappa_0 x \pm \tfrac14\pi)$ over each wave, since the individual waves are nearly
harmonic; and then this expression is nearly
\[ \int_{C_1t}^{C_2t} \frac{dx}{2t\,|\gamma_0''|\,\kappa_0^2} = \int_{C_1}^{C_2} \frac{dC}{2\,|\gamma_0''|\,\kappa_0^2}, \tag{16} \]
on putting $x = Ct$. But $\gamma_0''$ and $\kappa_0$ are functions of $C$ alone; hence the energy between two
points starting at the origin and moving out with constant speeds is independent of the time;
and these speeds are the local group-velocities.

All these results are approximations subject to the condition that we can safely reduce
the asymptotic expansion of $\zeta$ to its first term. This is usually satisfied; we shall see later
that if it is not the local motion is not even approximately simple harmonic in the
neighbourhood. Since we are effectively expanding in negative powers of $t$ the approximation
will always improve as the wave train advances.

In (15), the wave-lengths between the points $x = C_1t$ and $C_2t$ remain constant; but the
distance increases; hence the number of waves increases in proportion to $t$. The original
disturbance may at first give a solitary wave, but as it travels it develops into a train,
which becomes longer and longer.

There may be no real value of $\kappa$ that satisfies (6) for some values of $x/t$. In that case the
best procedure is to seek for saddle-points off the real axis. Then the exponent in (4) will
have a real part increasing numerically with the time. In a physical system, if this was
positive, we should ultimately have a steadily increasing energy. Since this is impossible
the real part of the exponent must be negative. Hence complex roots of (6) correspond
to places where there is little disturbance and what there is is not approximately
simple-harmonic. There may be a minimum or a maximum real group-velocity; in the former case
there will be little movement within a certain distance of the origin, increasing with the
time, in the latter beyond a certain distance. Since a minimum or a maximum
group-velocity implies the vanishing of $\gamma''$, and therefore of the denominator of our approximate
solution, we can infer that near the front or rear of such a train the amplitudes will be
large; but if we are too near it we can no longer use the method in its present form, since
it depends essentially on $\gamma_0''t$ being large enough to overwhelm later terms of the series.
Further consideration of this point requires attention to the cubic terms in the exponent,
which introduce the Airy integral.

17·09. Dispersion of water waves. All the features described in 17·07 and 17·08 are
exemplified in the theory of water waves when capillarity and gravity are both taken
into account. The wave-velocity is given as a function of $\kappa$ by
\[ c^2 = \left(\frac{g}{\kappa} + T'\kappa\right)\tanh\kappa H, \tag{1} \]
where the surface tension is $T'\rho$, $\rho$ being the density, and $H$ is the depth of the water. If
we choose the solution that reduces to $(gH)^{1/2}$ when $\kappa$ is small, $c$ is a single-valued function
of $\kappa$. Then
\[ \gamma^2 = (g\kappa + T'\kappa^3)\tanh\kappa H, \tag{2} \]
and $\gamma$ behaves like $(gH)^{1/2}\kappa$ for $\kappa$ small and like $(T'\kappa^3)^{1/2}$ for $\kappa$ large. The group-velocity $C$
therefore tends to $(gH)^{1/2}$ for $\kappa$ small and to $+\infty$ for $\kappa$ large. Taking the second
approximation for $\kappa$ small we have




\[ \gamma = (gH)^{1/2}\kappa\left\{1 + \tfrac12\left(\frac{T'}{g} - \tfrac13H^2\right)\kappa^2\right\}, \qquad C = \frac{d\gamma}{d\kappa} = (gH)^{1/2}\left\{1 + \tfrac32\left(\frac{T'}{g} - \tfrac13H^2\right)\kappa^2\right\}. \tag{3} \]

If then $T'/g < \tfrac13H^2$, $C$ will be less than $(gH)^{1/2}$ when $\kappa$ is small. This will be satisfied if the
depth is more than about 0.5 cm.; and for smaller depths it would be necessary to take
viscosity into account. We shall assume that $H$ is considerably more than 1 cm. Then as
$\kappa$ increases from 0, $C$ begins by decreasing; but it increases again for $\kappa$ large and therefore
has a minimum for an intermediate value of $\kappa$. The existence of a minimum wave-velocity
for water waves is well known, that of a minimum group-velocity less so, but it has a
considerable influence on their dispersion. We find that at a given time there is smooth
water near the origin, up to the distance that can have been reached by waves travelling
with the minimum group-velocity. Further out any given distance will be associated
with a group-velocity, but this will be associated with two possible wave-lengths,
the shorter controlled mainly by capillarity and the longer by gravity. Beyond a
distance $(gH)^{1/2}t$ there will be no gravity waves, but the capillary waves will still
be possible.
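
The depth condition can be checked numerically. The following is a minimal sketch in Python, assuming $g = 981$ cm./sec.² and $T' = 74$ cm.³/sec.² for water (round values that are not given in the text); the function names are illustrative only. It evaluates the depth at which $T'/g = \tfrac13H^2$ and compares the group-velocity for a small $\kappa$ with $(gH)^{1/2}$ on either side of that depth.

```python
import math

G = 981.0        # gravity in cm/sec^2 (assumed value)
T_PRIME = 74.0   # surface tension divided by density in cm^3/sec^2 (assumed value)


def gamma(kappa, depth):
    """Frequency from gamma^2 = (g kappa + T' kappa^3) tanh(kappa H)."""
    return math.sqrt((G * kappa + T_PRIME * kappa ** 3) * math.tanh(kappa * depth))


def group_velocity(kappa, depth, step=1e-6):
    """C = d(gamma)/d(kappa), estimated by a central difference."""
    return (gamma(kappa + step, depth) - gamma(kappa - step, depth)) / (2 * step)


def long_wave_velocity(depth):
    """Limit (gH)^(1/2) of both velocities as kappa tends to zero."""
    return math.sqrt(G * depth)


def critical_depth():
    """Depth at which T'/g equals H^2/3."""
    return math.sqrt(3 * T_PRIME / G)


if __name__ == "__main__":
    print("critical depth (cm):", critical_depth())
    for depth, kappa in ((5.0, 0.05), (0.2, 0.5)):
        print(depth, group_velocity(kappa, depth), long_wave_velocity(depth))
```

With these assumed constants the critical depth comes out a little under half a centimetre, in line with the figure of about 0.5 cm. quoted above.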

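If the depth is large we can take $\tanh\kappa H = 1$ for all but the longest waves; then
\[ \gamma = (g\kappa + T'\kappa^3)^{1/2}, \tag{4} \]
\[ C = \frac{g + 3T'\kappa^2}{2(g\kappa + T'\kappa^3)^{1/2}}, \tag{5} \]
\[ \frac{dC}{d\kappa} = \frac{3T'\kappa}{(g\kappa + T'\kappa^3)^{1/2}} - \frac{(g + 3T'\kappa^2)^2}{4(g\kappa + T'\kappa^3)^{3/2}}, \tag{6} \]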


and the minimum group-velocity corresponds to
\[ T'\kappa^2 = \left(\frac{2}{\sqrt3} - 1\right)g. \]
If we introduce the minimum wave-velocity $c_0$ given by
\[ c_0^2 = 2(gT')^{1/2}, \]
and the corresponding wave-length given by
\[ \kappa_0^2T' = g, \]
we have for minimum group-velocity
\[ \kappa = 0.393\kappa_0, \qquad C = 0.767c_0, \qquad c = 1.212c_0, \]


d*G 
dtc 2 


= 0-371^. 



( 7 ) 

( 8 ) 

( 9 ) 

( 10 ) 

( 11 ) 


On water the least wave-velocity is 23 cm./sec., and the corresponding wave-length
1.8 cm. Hence the minimum group-velocity is 18 cm./sec., the corresponding
wave-velocity 28 cm./sec., and the wave-length 4.6 cm. The approximation $\tanh\kappa H = 1$ is
therefore justified for waves with the minimum group-velocity or shorter waves provided
that the depth is more than about 5 cm.
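
The ratios quoted for the minimum group-velocity can be reproduced from the deep-water formulae $C = (g + 3T'\kappa^2)/2(g\kappa + T'\kappa^3)^{1/2}$ and $c^2 = g/\kappa + T'\kappa$. The sketch below is only an illustration: it works in units with $g = T' = 1$ (so that $\kappa_0 = 1$ and $c_0^2 = 2$), and the golden-section minimiser and all function names are choices made here, not taken from the text.

```python
import math


def group_velocity(kappa, g=1.0, t_prime=1.0):
    """Deep-water group-velocity C = (g + 3T'k^2) / (2 (g k + T' k^3)^(1/2))."""
    return (g + 3 * t_prime * kappa ** 2) / (2 * math.sqrt(g * kappa + t_prime * kappa ** 3))


def wave_velocity(kappa, g=1.0, t_prime=1.0):
    """Deep-water wave-velocity c = (g/k + T'k)^(1/2)."""
    return math.sqrt(g / kappa + t_prime * kappa)


def minimise(func, lo, hi, tol=1e-12):
    """Golden-section search for the minimum of a function with one minimum in [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return (a + b) / 2


KAPPA_0 = 1.0               # from kappa_0^2 T' = g with g = T' = 1
C_0 = math.sqrt(2.0)        # from c_0^2 = 2 (g T')^(1/2)

kappa_min = minimise(group_velocity, 0.05, 1.0)
closed_form = math.sqrt(2 / math.sqrt(3) - 1)   # root of dC/dkappa = 0

if __name__ == "__main__":
    print("kappa / kappa_0:", kappa_min / KAPPA_0)
    print("C / c_0:", group_velocity(kappa_min) / C_0)
    print("c / c_0:", wave_velocity(kappa_min) / C_0)
```

The numerical minimum agrees with the root of $dC/d\kappa = 0$, and the three ratios agree with the quoted 0.393, 0.767 and 1.212 to within about one unit in the third decimal place.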

We shall take first the case of very deep water and waves such that $x/t$ is small compared
with $(gH)^{1/2}$ but large compared with the minimum group-velocity. Then we can write
\[ \gamma = (g\kappa)^{1/2}, \quad c = (g/\kappa)^{1/2}, \quad C = \tfrac12(g/\kappa)^{1/2}, \quad dC/d\kappa = -\tfrac14(g/\kappa^3)^{1/2}, \tag{12} \]


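and from 17·08 (11) ($\kappa_0$ defined as in 17·08)
\[ \zeta \sim \frac{1}{\kappa_0\sqrt{(2\pi t\,|\gamma_0''|)}}\,\sin(\gamma_0t - \kappa_0x - \tfrac14\pi), \tag{13} \]
where
\[ \kappa_0 = \frac{gt^2}{4x^2}. \tag{14} \]
Thus
\[ \zeta \sim \frac{2x^{1/2}}{(\pi g)^{1/2}\,t}\,\sin\left(\frac{gt^2}{4x} - \tfrac14\pi\right), \tag{15} \]
and the amplitude decreases towards the rear of the train like $x^{1/2}$.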

This result, however, is modified if we return to 17·08 (3); for the train that we have
estimated is a single one, supposed to have started from $x = 0$. There will actually be two
superposed trains, one having started from $x = h$ and the other from $x = -h$. If the
wave-length exceeds $2h$, therefore, we should restore these, and then the full solution will be
approximately $-2h\,\partial/\partial x$ of what we have just found. It will therefore increase towards the
rear of the train like $x^{-3/2}$. The behaviour of the train therefore depends greatly on the
form of the initial disturbance. The splash of a brick, for instance, will give $2h$ about 10 cm.
Waves shorter than this will have their smaller amplitudes towards the rear of the train,
longer ones their larger amplitudes. Hence there will be a wave-length associated with a
maximum amplitude and determined mostly by the scale of the initial disturbance. But
a rain-drop or a rising fish gives an initial disturbance with a scale comparable with 1 cm.,
and the amplitude of the gravity waves will increase all the way to the rear of the train.





Next, take capillary waves. We have now
\[ \gamma = (T'\kappa^3)^{1/2}, \quad c = (T'\kappa)^{1/2}, \quad C = \tfrac32(T'\kappa)^{1/2}, \quad dC/d\kappa = \tfrac34(T'/\kappa)^{1/2}, \tag{16} \]
\[ \zeta \sim \frac{1}{\kappa_0\sqrt{(2\pi t\,\gamma_0'')}}\,\sin(\gamma_0t - \kappa_0x + \tfrac14\pi), \tag{17} \]
where
\[ \tfrac32(T'\kappa_0)^{1/2} = \frac{x}{t}. \tag{18} \]

The amplitude therefore is roughly proportional to $x^{-3/2}$ at a given time. Thus even without
the effect of viscosity, which is considerable for these short waves, their amplitudes will
fall off rapidly towards the front.

Two exceptional cases arise, namely for group-velocities near $(gH)^{1/2}$ and those near the
minimum. The standard formula found in the method of steepest descents depends on $f''(z)$
being large enough for us to be able to replace $\exp tf(z)$ by $\exp tf(z_0)\exp\{\tfrac12tf''(z_0)(z - z_0)^2\}$
until it is small compared with its value at the saddle-point. This will be possible for $t$
large enough if $f''(z_0)$ is not zero. If $f''(z_0) = 0$, the behaviour of the integrand will depend
on the terms in $f'''(z_0)(z - z_0)^3$, and the method is not applicable in its simplest form. Near
a value of $z$ that gives $f''(z) = 0$ the approximation will not be good unless $t$ is much larger
than would suffice to give a good approximation elsewhere. We can, however, obtain useful
solutions in terms of the Airy integral.

Take first the longest waves. Since the horizontal extent is then of dominating
importance we allow for the rear wave by applying the operator $-2h\,\partial/\partial x$ to 17·08 (4); then from
17·09 (3)
\[ \zeta = \frac{h}{2\pi}\int_{-\infty}^{\infty} e^{i(\gamma t - \kappa x)}\,d\kappa \approx \frac{h}{\pi}\int_0^\infty \cos\{(gH)^{1/2}\kappa t(1 - \tfrac16\kappa^2H^2) - \kappa x\}\,d\kappa \]
\[ = \frac{h}{\pi}\int_0^\infty \cos\{\kappa x - (gH)^{1/2}\kappa t + \tfrac16(gH)^{1/2}tH^2\kappa^3\}\,d\kappa. \tag{19} \]
Put
\[ (gH)^{1/2} = \alpha, \qquad z^3 = \frac{2(x - \alpha t)^3}{\alpha tH^2}, \qquad \kappa = \frac{zs}{x - \alpha t}. \tag{20} \]
Then
\[ \zeta = \frac{hz}{\pi(x - \alpha t)}\int_0^\infty \cos(zs + \tfrac13s^3)\,ds = \frac{hz}{x - \alpha t}\,\mathrm{Ai}(z). \tag{21} \]

At points where $x = \alpha t$, the amplitude decreases with time like $t^{-1/3}$ instead of $t^{-1/2}$ as at
places where the group-velocity behaves ordinarily. The front of the gravity wave
therefore becomes more and more the most prominent feature of the disturbance. The
Airy integral decreases towards positive argument, and the disturbance at values of $x$
greater than $\alpha t$ falls off rapidly and smoothly. The maximum elevation is a little behind
the place where $x = \alpha t$, and is followed by a train of waves becoming smaller and shorter.
If $x/t$ is near the minimum group-velocity and $h$ much less than the corresponding
wave-length,
\[ \zeta = \frac{h}{2\pi}\int_{-\infty}^{\infty} e^{i(\gamma t - \kappa x)}\,d\kappa = \frac{h}{\pi}\int_0^\infty \cos(\gamma t - \kappa x)\,d\kappa. \tag{22} \]
Let suffix $m$ indicate values corresponding to the stationary group-velocity. For the
positive value $\kappa_m$, put
\[ \kappa = \kappa_m + \kappa_1, \qquad \gamma = \kappa_mc_m + C_m\kappa_1 + \tfrac16C_m''\kappa_1^3. \tag{23} \]
Then
\[ \gamma t - \kappa x \approx \kappa_m(c_mt - x) + \kappa_1(C_mt - x) + \tfrac16C_m''t\kappa_1^3. \tag{24} \]

The saddle-points are near $\kappa_1 = \pm\{2(x/t - C_m)/C_m''\}^{1/2}$. Suppose that $x/t$ is such that this
is small, and take a circle $|\kappa_1| = $ constant such that the saddle-points lie within it. The
error in neglecting the parts of the path of integration outside the circle is $O(1/t)$, and we
can deform the path inside the circle so that it passes through $\kappa_1 = 0$. Then
\[ \zeta \approx \frac{h}{\pi}\cos\kappa_m(c_mt - x)\int_{-\infty}^{\infty}\cos\{\kappa_1(C_mt - x) + \tfrac16C_m''t\kappa_1^3\}\,d\kappa_1 \]
\[ \qquad - \frac{h}{\pi}\sin\kappa_m(c_mt - x)\int_{-\infty}^{\infty}\sin\{\kappa_1(C_mt - x) + \tfrac16C_m''t\kappa_1^3\}\,d\kappa_1. \tag{25} \]

The second integral is zero; and
\[ \zeta \approx 2h\left(\frac{2}{C_m''t}\right)^{1/3}\cos\kappa_m(c_mt - x)\,\mathrm{Ai}\left\{\left(\frac{2}{C_m''t}\right)^{1/3}(C_mt - x)\right\}. \tag{26} \]

The errors due to neglecting $C_m'''$ and later terms within an arc of $\kappa_1$ such that $\kappa_1^3C_m''t$ has become
large are easily shown to be of the order of $t^{-2/3}$ of the main term; hence the error of (26) is
$O(1/t)$. For large $t$ the last factor varies much more slowly with $x$ than the cosine, on account
of the small factor $t^{-1/3}$ in its argument. Hence the variation can be described as a series of
waves of length $2\pi/\kappa_m$ and period $2\pi/\kappa_mc_m$, but with the amplitude falling off exponentially
for $x < C_mt$ and oscillating slowly for $x > C_mt$. The most conspicuous feature is the ring of
waves with the length corresponding to the minimum group-velocity, surrounding a
circle of smooth water.


17·10. Interrupted harmonic wave train. In the direct measurement of the
velocity of light or sound a continuous train of waves, which can be treated as harmonic,
is interrupted at regular intervals by a toothed wheel or a revolving mirror, giving a series
of flashes. What is observed is the time such that the flash returning after reflexion at a
distance is blocked by the revolving mechanism near the eye or ear. Thus the experiment
does not depend directly on the time of travel of individual waves, but on that of variations
of intensity. The simplest statement of the phenomenon would be to regard the train as
having beats, so that the disturbance can be expressed in the form
\[ \zeta = \cos\gamma_0t\,\cos\frac{\pi t}{2k} = \tfrac12\cos\left(\gamma_0 - \frac{\pi}{2k}\right)t + \tfrac12\cos\left(\gamma_0 + \frac{\pi}{2k}\right)t. \tag{1} \]

The disturbance has period $2\pi/\gamma_0$ but its amplitude vanishes whenever $t$ is an odd multiple
of $k$. In the next beat the phase is reversed. Now suppose that the wave-length $2\pi/\kappa_0$
for period $2\pi/\gamma_0$ is given by $\kappa_0 = \gamma_0/c_0$ and for neighbouring periods by
\[ \kappa = \kappa_0 + \frac{d\kappa}{d\gamma}\,\delta\gamma = \kappa_0 + (1/C)\,\delta\gamma. \]
Then the disturbance at distance $x$ is
\[ \zeta = \tfrac12\cos\left\{\left(\gamma_0 - \frac{\pi}{2k}\right)t - \left(\kappa_0 - \frac{\pi}{2kC}\right)x\right\} + \tfrac12\cos\left\{\left(\gamma_0 + \frac{\pi}{2k}\right)t - \left(\kappa_0 + \frac{\pi}{2kC}\right)x\right\} \]
\[ = \cos(\gamma_0t - \kappa_0x)\,\cos\left\{\frac{\pi}{2k}\left(t - \frac{x}{C}\right)\right\}. \tag{2} \]
Since $\gamma_0 = \kappa_0c_0$, the first factor shows that the phases travel with the wave-velocity. But
the second factor shows that the beats travel with the group-velocity $d\gamma/d\kappa = C$. Hence
experiments of the type in question determine the group-velocity associated with period
$2\pi/\gamma_0$, not the wave-velocity.
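
The factorization in (2) is easy to confirm numerically. The sketch below is illustrative only: the sample values of $\gamma_0$, $\kappa_0$, $C$ and $k$ are arbitrary assumptions, and the function names are not from the text. It compares the sum of the two neighbouring harmonic components with the product of the carrier and the beat factor, and checks that the beat factor vanishes at $t = x/C + k$, that is, at a zero of the envelope that has travelled with the group-velocity.

```python
import math


def two_component_sum(x, t, gamma0, kappa0, group_velocity, k):
    """Sum of the two harmonic components with frequencies gamma0 -/+ pi/(2k)."""
    delta = math.pi / (2 * k)
    low = (gamma0 - delta) * t - (kappa0 - delta / group_velocity) * x
    high = (gamma0 + delta) * t - (kappa0 + delta / group_velocity) * x
    return 0.5 * math.cos(low) + 0.5 * math.cos(high)


def envelope(x, t, group_velocity, k):
    """Beat factor cos{(pi/2k)(t - x/C)}."""
    return math.cos(math.pi / (2 * k) * (t - x / group_velocity))


def carrier_times_envelope(x, t, gamma0, kappa0, group_velocity, k):
    """Product form: carrier with the wave-velocity, envelope with the group-velocity."""
    return math.cos(gamma0 * t - kappa0 * x) * envelope(x, t, group_velocity, k)


# Arbitrary sample values, chosen only for the check.
SAMPLE = dict(gamma0=50.0, kappa0=2.0, group_velocity=12.5, k=0.3)

if __name__ == "__main__":
    for x, t in ((0.0, 0.1), (7.0, 1.3), (20.0, 2.0)):
        print(two_component_sum(x, t, **SAMPLE), carrier_times_envelope(x, t, **SAMPLE))
```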

17·11. Refraction of a pulse. Suppose that there is an instantaneous disturbance
at a point $O$ within one medium and that it is partly transmitted into a second medium
where the velocity of propagation is different. Both media are dispersive, that is, the
velocity depends on the wave-length. Suppose that the disturbance in the second medium
is observed at a point $P$; this may be regarded as the resultant of a set of partial
disturbances coming by way of various points $Q$ on the interface. Then the time factor in the
disturbance at $P$ is
\[ \exp i(\gamma t - \kappa_1s_1 - \kappa_2s_2), \tag{1} \]
where $s_1$ and $s_2$ are the distances of $Q$ from $O$ and $P$, and $\kappa_1$ and $\kappa_2$ correspond to $\gamma$ in the
two media. Let $Q$ be at a distance $s$ measured along the boundary from some fixed point.
Then the chief part of the disturbance at $P$, since $\gamma$ and $s$ are both variables, will be found
by making this factor stationary with regard to both, $t$ and the position of $P$ being kept
fixed. This gives
\[ \kappa_1\frac{\partial s_1}{\partial s} + \kappa_2\frac{\partial s_2}{\partial s} = 0, \tag{2} \]
\[ t - s_1\frac{d\kappa_1}{d\gamma} - s_2\frac{d\kappa_2}{d\gamma} = 0. \tag{3} \]

Introducing the wave-velocity and the group-velocity for each medium we see that
these are equivalent to
\[ \frac{1}{c_1}\frac{\partial s_1}{\partial s} + \frac{1}{c_2}\frac{\partial s_2}{\partial s} = 0, \tag{4} \]
\[ \frac{s_1}{C_1} + \frac{s_2}{C_2} = t. \tag{5} \]

The first of these is simply the usual law of refraction; the effective values of $s$ are around
the value that makes the time from $O$ to $P$ stationary for a point travelling with the
wave-velocity. Thus the directions of travel of the waves are determined by the wave-velocity.
The second relation, however, shows that the dominant period is determined by the
group-velocities; for this is the equation corresponding to $x/t = C$ determining the dominant
period in simple dispersion. Thus refraction in dispersive media is rather complicated,
the lines of constant phase being in general inclined to those of constant period.*

17·12. Asymptotic solutions of differential equations. These are of several
types. If
\[ \frac{d^2y}{dx^2} + f(x)\frac{dy}{dx} + g(x)\,y = 0 \tag{1} \]
and $f(x)$, $g(x)$ are analytic and bounded as $x$ tends to infinity, we can write the equation as
\[ \frac{d^2y}{dx^2} + \left(a_0 + \frac{a_1}{x} + \dots\right)\frac{dy}{dx} + \left(b_0 + \frac{b_1}{x} + \dots\right)y = 0. \tag{2} \]

* R. Stoneley, Proc. Camb. Phil. Soc. 31, 1935, 360-7.

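The functions
\[ z_1 = e^{\lambda_1x}x^{\sigma_1}, \qquad z_2 = e^{\lambda_2x}x^{\sigma_2} \tag{3} \]
satisfy the differential equation
\[ z'' - \left\{\lambda_1 + \lambda_2 + \frac{\sigma_1 + \sigma_2}{x} - \frac{\sigma_2 - \sigma_1}{x^2\{\lambda_2 - \lambda_1 + (\sigma_2 - \sigma_1)/x\}}\right\}z' \]
\[ \qquad + \left\{\left(\lambda_1 + \frac{\sigma_1}{x}\right)\left(\lambda_2 + \frac{\sigma_2}{x}\right) + \frac{\lambda_2\sigma_1 - \lambda_1\sigma_2}{x^2\{\lambda_2 - \lambda_1 + (\sigma_2 - \sigma_1)/x\}}\right\}z = 0. \tag{4} \]
The coefficients agree to terms in $1/x$ with those in (2) provided that
\[ \lambda_1 + \lambda_2 = -a_0, \qquad \lambda_1\lambda_2 = b_0, \qquad \lambda_2 - \lambda_1 \neq 0, \tag{5} \]
\[ \sigma_1 + \sigma_2 = -a_1, \qquad \lambda_1\sigma_2 + \lambda_2\sigma_1 = b_1. \tag{6} \]
The first two of these equations determine $\lambda_1$, $\lambda_2$, which will be different unless $b_0 = \tfrac14a_0^2$.
Except in this case $\sigma_1$ and $\sigma_2$ are determinate. Convergent solutions can then be
determined by the method of variation of parameters, but contain incomplete factorial
functions. Substituting asymptotic expansions for these we get forms for $y_1$, $y_2$ such that
$y_1/z_1$ and $y_2/z_2$ are given asymptotically by series in descending powers of $x$.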

The result was given by E. L. Ince,* by a rather different method. He did not attend
explicitly to the case where the method leads to $\lambda_1 = \lambda_2$. The solution then takes a different
form. We can suppose a factor $e^{\lambda x}$ removed from the solutions, so that in (2) $a_0$ and $b_0$ are
zero. In this case the functions
\[ z_1, z_2 = e^{\pm\mu x^{1/2}}x^{\sigma} \tag{7} \]
satisfy
\[ z'' - \frac{2(\sigma - \tfrac14)}{x}\,z' - \left\{\frac{\mu^2}{4x} - \frac{\sigma(\sigma + \tfrac12)}{x^2}\right\}z = 0, \tag{8} \]
which agrees with (2) to terms in $1/x$ if
\[ 2(\sigma - \tfrac14) = -a_1, \qquad \tfrac14\mu^2 = -b_1. \tag{9} \]

There are solutions of (2) asymptotically equal to $z_1$, $z_2$ multiplied by series in descending
powers of $x^{1/2}$, provided $b_1 \neq 0$.

If further $b_1 = 0$, (2) has a regular singularity at infinity, and there are two convergent
series solutions unless the indices are equal or differ by an integer, when one solution may
contain a logarithm.

Hence if $f(x)$, $g(x)$ tend to $a_0$, $b_0$ as $x$ tends to infinity, we can in general substitute for
$y$ a series
\[ e^{\lambda x}x^{\sigma}\left(1 + \frac{c_1}{x} + \frac{c_2}{x^2} + \dots\right) \tag{10} \]
and determine $\lambda$, $\sigma$ and the coefficients in turn, and the series so obtained will either
converge or be asymptotic to two solutions of the differential equation. In the exceptional
case where the equation for $\lambda$ has equal roots, there are solutions of the form
\[ e^{\pm\mu x^{1/2}}x^{\sigma}\left(1 + \frac{c_1}{x^{1/2}} + \frac{c_2}{x} + \dots\right) \tag{11} \]
with similar properties. We shall refer to such solutions (i.e., in descending powers of $x$)
as of Stokes's type, though the first expansions of this type seem to have been given by
Jacobi for the Bessel functions.

* Ordinary Differential Equations, 1927, 169-71.

17·121. Write Bessel's equation in the form
\[ \frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left(1 - \frac{n^2}{x^2}\right)y = 0. \]
This is in the form required, with $\lambda = \pm i$. Put
\[ y = e^{ix}u. \]
Then
\[ u'' + \left(2i + \frac{1}{x}\right)u' + \left(\frac{i}{x} - \frac{n^2}{x^2}\right)u = 0. \]
Now put $u = x^{\sigma}v$, and choose $\sigma$ so that the term in $v/x$ is zero. We find that $\sigma = -\tfrac12$, and
\[ v'' + 2iv' - \frac{n^2 - \tfrac14}{x^2}\,v = 0. \]
Substitute
\[ v = a_0 + \frac{a_1}{x} + \frac{a_2}{x^2} + \dots; \]
we find the recurrence relation
\[ \{r(r + 1) - (n^2 - \tfrac14)\}\,a_r = 2i(r + 1)\,a_{r+1}. \]
Hence, with $a_0 = 1$,
\[ v = 1 + \frac{\tfrac14 - n^2}{2ix} + \frac{(\tfrac14 - n^2)(\tfrac94 - n^2)}{(2i)^2\,2!\,x^2} + \frac{(\tfrac14 - n^2)(\tfrac94 - n^2)(\tfrac{25}{4} - n^2)}{(2i)^3\,3!\,x^3} + \dots \]
\[ = \left\{1 - \frac{(\tfrac14 - n^2)(\tfrac94 - n^2)}{2!\,(2x)^2} + \frac{(\tfrac14 - n^2)(\tfrac94 - n^2)(\tfrac{25}{4} - n^2)(\tfrac{49}{4} - n^2)}{4!\,(2x)^4} - \dots\right\} \]
\[ \qquad - i\left\{\frac{\tfrac14 - n^2}{2x} - \frac{(\tfrac14 - n^2)(\tfrac94 - n^2)(\tfrac{25}{4} - n^2)}{3!\,(2x)^3} + \dots\right\} \]
\[ = U - iV \text{ say.} \]

Thus we have found an asymptotic solution
\[ y_1 \sim x^{-1/2}e^{ix}(U - iV) = x^{-1/2}(U\cos x + V\sin x) + ix^{-1/2}(U\sin x - V\cos x). \]

The real and imaginary parts separately will be asymptotic solutions.

The choice of coefficients to make the solution correspond to $J_n(x)$ can be made by
considering the first term of the asymptotic expansion found from the complex integral
solution. We shall postpone this until we consider the most convenient companion
function to $J_n(x)$.

If $n$ is small and $x$ large the terms begin by decreasing rapidly. Thus for $n = 0$ and
$x = 10$ the term in $x^{-2}$ is
\[ \frac{9}{16\cdot2\cdot(20)^2} = \frac{9}{12800} \]
of the first. But the terms of the ascending power series do not begin to decrease till the
fifth, and many more are needed to give an accuracy of 1 in 1000. The asymptotic series
is therefore of great use.
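
The comparison between the two series can be tried out directly. The sketch below builds $U$ and $V$ from the recurrence $\{r(r+1) - (n^2 - \tfrac14)\}a_r = 2i(r+1)a_{r+1}$ and compares with the ascending power series for $n = 0$, $x = 10$. The normalization $J_n(x) \sim (2/\pi x)^{1/2}\{U\cos\chi + V\sin\chi\}$ with $\chi = x - (\tfrac12n + \tfrac14)\pi$ is the standard one and is assumed here, since the text postpones the choice of coefficients; the function names and the number of terms kept are likewise choices made for the illustration.

```python
import math


def stokes_uv(n, x, terms=10):
    """U and V from v = sum a_r x^(-r), a_0 = 1, using the recurrence for a_{r+1}."""
    a = 1 + 0j
    v = a
    for r in range(terms):
        a = a * (r * (r + 1) - (n * n - 0.25)) / (2j * (r + 1))
        v += a / x ** (r + 1)
    return v.real, -v.imag          # because v = U - iV


def bessel_j_asymptotic(n, x, terms=10):
    """Stokes-type approximation to J_n(x) with the assumed standard normalization."""
    u, v = stokes_uv(n, x, terms)
    chi = x - (0.5 * n + 0.25) * math.pi
    return math.sqrt(2 / (math.pi * x)) * (u * math.cos(chi) + v * math.sin(chi))


def bessel_j_series(n, x, terms=60):
    """Ascending power series for J_n(x), n a non-negative integer."""
    return math.fsum(
        (-1) ** k * (x / 2) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
        for k in range(terms)
    )


if __name__ == "__main__":
    print("second term of U:", stokes_uv(0, 10.0, terms=2)[0] - 1.0)
    print("asymptotic:", bessel_j_asymptotic(0, 10.0))
    print("ascending: ", bessel_j_series(0, 10.0))
```

A handful of descending terms already reproduces the ascending series to several decimals at $x = 10$, whereas the ascending series needs a few dozen terms there.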

It is not useful unless $2x$ is considerably greater than $n^2$.

If $n$ is half an odd integer both series terminate and the solutions are expressed in
finite terms. In particular if $n = \tfrac12$, $U = 1$, $V = 0$, and a pair of solutions is $x^{-1/2}\cos x$,
$x^{-1/2}\sin x$.




17·122. Sometimes $f(x)$ and $g(x)$ are not conveniently expressed in power series. The
following method is then useful. Put
\[ y = uv, \tag{1} \]
and determine $v$ so that the coefficient of $u'$ is 0. This gives
\[ 2\frac{v'}{v} + f = 0, \qquad v = \exp\left(-\tfrac12\int f\,dx\right), \tag{2} \]
the constant factor being irrelevant. Then the equation for $u$ takes the form
\[ u'' = \chi(x)\,u. \tag{3} \]
We assume $\chi(x)$ large but $\chi'/\chi$ small in the interval considered. Put
\[ u = \exp\int\eta\,dx. \tag{4} \]
Then
\[ u' = \eta u, \tag{5} \]
and
\[ \eta^2 + \eta' = \chi(x). \tag{6} \]
This equation is remarkably simple in appearance, but is non-linear. However, if $\chi(x)$
varies slowly we can take, approximately,
\[ \eta = \pm\chi^{1/2}(x); \tag{7} \]
and in the second approximation $\eta'$ will be small. Then
\[ \eta = \pm\chi^{1/2} - \frac{\chi'}{4\chi}, \qquad u = \chi^{-1/4}\exp\left(\pm\int\chi^{1/2}\,dx\right). \tag{8} \]

The easiest way of examining the accuracy is to attempt a third approximation; redeter¬ 
mining ij' from (8) and substituting in (6) we find 


_LAx' 

l A 4 X 8 x lk d x x 


(9) 


If A is the smallest value of in the range considered the integral of the last term wall be 
° f0rder 8l[9 . If therefore x'lx varies by a moderate factor in the range an appreciable 
error will accumulate unless x is everywhere large.* 

* Cf. W. E. Milne, Trans. Amer. Math. Soc. 31, 1929, 907-18; E. C. Titchmarsh, Journ. Lond.
Math. Soc. 19, 1945, 66-8.

17·123. This method can frequently be used to suggest the first term in an asymptotic
expansion. If we take the equation for the Airy integral
\[ \frac{d^2y}{dx^2} = xy, \]
we have $\chi = x$, $\int\chi^{1/2}\,dx = \tfrac23x^{3/2}$, and there are asymptotic solutions beginning with
\[ x^{-1/4}\exp(\pm\tfrac23x^{3/2}). \]
The exponent is not a multiple of $x$, so that the solution is not of the form considered in
17·12; but we could transform to $x^{3/2}$ as a new variable and proceed.
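
How well this first term satisfies $y'' = xy$ can be seen by substituting it back. A direct differentiation gives $u''/u = x + 5/16x^2$ for $u = x^{-1/4}\exp(\pm\tfrac23x^{3/2})$, so the relative residual is $5/16x^3$. The sketch below, with illustrative function names and an arbitrarily chosen difference step, checks this with a central second difference.

```python
import math


def green_approximation(x, sign=-1):
    """First term x^(-1/4) exp(sign * (2/3) x^(3/2)) for the equation y'' = x y."""
    return x ** -0.25 * math.exp(sign * (2.0 / 3.0) * x ** 1.5)


def relative_residual(x, sign=-1, step=1e-3):
    """(u'' - x u) / (x u), with u'' estimated by a central second difference."""
    u_minus = green_approximation(x - step, sign)
    u_mid = green_approximation(x, sign)
    u_plus = green_approximation(x + step, sign)
    second = (u_plus - 2 * u_mid + u_minus) / step ** 2
    return (second - x * u_mid) / (x * u_mid)


def predicted_residual(x):
    """Residual 5/(16 x^3) obtained by differentiating the approximation exactly."""
    return 5.0 / (16.0 * x ** 3)


if __name__ == "__main__":
    for x in (5.0, 10.0):
        print(x, relative_residual(x), relative_residual(x, sign=+1), predicted_residual(x))
```

The residual falls off like $x^{-3}$, which is the sense in which the approximation improves when $\chi$ is large and $\chi'/\chi$ small.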


17·13. A somewhat similar solution can be obtained if $\chi(x)$ depends both on $x$ and on
an additional parameter $h$, and
\[ \frac{d^2y}{dx^2} = \chi(x)\,y = (h^2\chi_0 + h\chi_1 + \chi_2)\,y, \tag{1} \]
where $h$ is large. For any $h$ the equation will have a pair of solutions, valid in any range
and not only when $x$ is large. We want to know how these solutions behave when $h$ is large.
We therefore want an expansion in descending powers of $h$. Put
\[ y = \left(\frac{d\xi}{dx}\right)^{-1/2}z; \tag{2} \]
the differential equation becomes
\[ \xi'^2\,\frac{d^2z}{d\xi^2} = \left(h^2\chi_0 + h\chi_1 + \chi_2 + \tfrac12\frac{\xi'''}{\xi'} - \tfrac34\frac{\xi''^2}{\xi'^2}\right)z. \tag{3} \]
Take
\[ \xi = \int_0^x\left(\chi_0 + \frac{\chi_1}{h} + \frac{\psi_2}{h^2}\right)^{1/2}dx, \tag{4} \]
where $\psi_2$ is to some extent at our disposal. Then the differential equation takes the form
\[ \frac{d^2z}{d\xi^2} - h^2z = g(\xi, h)\,z; \tag{5} \]
$g$ is expansible in descending powers of $h$. Substituting
\[ z = e^{h\xi}\left(1 + \frac{f_1}{h} + \frac{f_2}{h^2} + \dots\right) \tag{6} \]
and equating coefficients of $h^2$, $h$, 1, ... in turn, we can find the functions $f_1$, $f_2$, ... one by one.
Similarly we find a formal solution starting with $\exp(-h\xi)$. There are advantages in
choosing $\psi_2$ so that $|g(\xi, h)|$ will be as small as possible.

In practice we are usually concerned with only the first approximation. We take $h$ real
and positive. Even for this the proof that the solutions are asymptotic requires restrictions
on $g(\xi, h)$ and on the region. In particular the region may be unbounded and we want
conditions that there may be solutions $Z_1$, $Z_2$ satisfying
\[ Z_1\exp(-h\xi) = 1 + M_1/h; \qquad Z_2\exp(h\xi) = 1 + M_2/h, \tag{7} \]
where $M_1$, $M_2$ are bounded for all $\xi$ of the region and for all $h > h_0 > 0$. Suitable conditions
on the region, $E$, are (1) $E$ contains at least one point of the real axis, (2) if $\xi$ is in $E$, then
all points $\eta = \Re(\xi) + i\theta\,\Im(\xi)$, $0 \le \theta \le 1$, are also in $E$ (and consequently the upper and
lower bounds of $\Re(\xi)$, which may be infinite, are taken for points on the real axis); and on
$g(\xi, h)$, (3) that on any paths in $E$ along the real or parallel to the imaginary axis,
\[ \int_{\xi_1}^{\xi_2}|g(t, h)|\,|dt| < M, \]
with $M$ independent of $\xi_1$, $\xi_2$ and $h$ for $h \ge h_0$. The method is to notice
that the solutions are solutions of the integral equations
\[ Z_1(\xi) = e^{h\xi} + \frac{1}{2h}\int_A^{\xi}\{e^{h(\xi - t)} - e^{-h(\xi - t)}\}\,g(t, h)\,Z_1(t)\,dt, \tag{8} \]
\[ Z_2(\xi) = e^{-h\xi} - \frac{1}{2h}\int_{\xi}^{B}\{e^{h(\xi - t)} - e^{-h(\xi - t)}\}\,g(t, h)\,Z_2(t)\,dt, \tag{9} \]
where $A$, $B$ are the lower and upper bounds of $\Re(\xi)$, and the path in (8) is from $A$ to $\Re(\xi)$
along the real axis and from $\Re(\xi)$ to $\xi$ parallel to the imaginary axis; for (9) it is similarly
from $\xi$ to $\Re(\xi)$ and then to $B$. It can be shown that substitution of successive
approximations gives corrections $k_r(\xi, h)$ times the first term, such that $|k_r| < (2M/h)^r$. Thus the
method leads to absolutely convergent solutions, and the error of the first term is uniformly
less than $4M/h$ of the first term for $h \ge 4M$. Attention to the form of $k_r$ shows that it has
an asymptotic expansion in powers of $1/h$, and $Z_1$, $Z_2$ can be rearranged in the form (6).

Condition (2) is important because the proof assumes that in (8) the upper bound on
the path of $|e^{2ht}|$, and in (9) that of $|e^{-2ht}|$, are taken at $t = \xi$. This would not be true if
the condition was not satisfied. If possible $\psi_2$ should be chosen so that condition (3) is
satisfied. For instance, if
\[ y'' = (h^2 + a)\,y, \tag{10} \]
exact solutions are $\exp\{\pm(h^2 + a)^{1/2}x\}$. If in (4) we took $\psi_2 = a$, the first terms of the
approximate solutions are identical with the exact solutions. If we took $\psi_2 = 0$ the first
terms would be $\exp(\pm hx)$. The ratios of these to the exact solutions are approximately
$\exp(\pm ax/2h)$, in which the indices are unbounded in an infinite interval of $x$. This is
a case where it is best to take $\psi_2 = \chi_2$.
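
The size of the index in this example is easily checked. The sketch below, in which the function name and the sample values of $h$, $a$ and $x$ are arbitrary choices for illustration, takes the growing solution and compares the logarithm of the ratio $\exp(hx)/\exp\{(h^2 + a)^{1/2}x\}$ with $ax/2h$ in magnitude; for this choice of sign the logarithm is negative.

```python
import math


def log_ratio_to_exact(h, a, x):
    """log of exp(h x) / exp((h^2 + a)^(1/2) x): first term with psi_2 = 0 against the exact solution."""
    return h * x - math.sqrt(h * h + a) * x


if __name__ == "__main__":
    h, a = 50.0, 3.0
    for x in (1.0, 10.0, 1000.0):
        print(x, log_ratio_to_exact(h, a, x), -a * x / (2 * h))
```

For fixed $h$ the index grows in proportion to $x$, so that the ratio departs from unity without limit over an infinite interval, however large $h$ may be.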

On the other hand take the equation, suggested originally by Prof. G. H. Hardy,
\[ y'' - (h^2\theta'^2 + h\theta'')\,y = 0. \tag{11} \]
Exact solutions are
\[ y_1 = \exp h\theta; \qquad y_2 = 2he^{h\theta}\int_x^{\infty}e^{-2h\theta}\,dt = \frac{1}{\theta'}\exp(-h\theta)\,\{1 + O(1/h)\} \tag{12} \]
on integration by parts.

If we take $\theta = (\log x)^2$ and $\psi_2 = \chi_2$ we find
\[ h\xi' = 2h\,\frac{\log x}{x}\left\{1 - \frac{1}{4h}\frac{\log x - 1}{(\log x)^2} - \frac{1}{32h^2}\frac{(\log x - 1)^2}{(\log x)^4} + \dots\right\}. \tag{13} \]
The first two terms lead to
\[ y_1 = \exp h(\log x)^2, \tag{14} \]
\[ y_2 = \frac{x}{2\log x}\exp\{-h(\log x)^2\}, \tag{15} \]
which agree with the first terms of the exact solutions. But the third term introduces
factors
\[ \exp\{\pm(1 - 4\log x)/32h(\log x)^2\}\,(\log x)^{\mp1/16h}, \tag{16} \]
which vary by indefinitely large factors in an infinite range of $x$; that is, the
approximation given by the first three terms is not uniform. Closer investigation of (5) shows that
taking $\psi_2 = \chi_2$ makes $\int^{\infty}g(\xi, h)\,d\xi$ diverge. But if we take simply
\[ h\xi' = h\theta' + \frac{\theta''}{2\theta'}, \tag{17} \]
corresponding to
\[ \psi_2 = \frac{\theta''^2}{4\theta'^2}, \tag{18} \]
the integral converges absolutely and uniformly, and the approximation does not contain
the objectionable factor introduced by the third term of (13).

Approximations of this type have a long history. The first with an application to a
general $\chi$ seems to have been given by G. Green,* who used an equivalent process in
showing that for tidal waves in a canal of slowly varying section the energy is transmitted
without loss by reflexion. We shall therefore describe them as of Green's type. The
transition from physical optics, depending on a second-order partial differential equation in
three dimensions, to geometrical optics, depending on a first-order one, really involves
the same principle. A special application to Bessel functions had been given still earlier,
by G. Carlini in 1817.† There are numerous applications in wave mechanics, the chief
perhaps being in the proof that classical mechanics is the limiting case of quantum
mechanics when the energy is large. The present form is due to Jeffreys.‡

* Camb. Phil. Trans. 6, 1837, 457-62; Papers, p. 225. Cf. Lamb, Hydrodynamics, 1932, p. 274.

† Watson, Theory of Bessel Functions, p. 6.

‡ Proc. Lond. Math. Soc. (2) 23, 1924, 428-36. Other references in Jeffreys, Phil. Mag. (7) 33, 1942,
451-6.

17·131. If $\chi_0$ has a zero in the region, $\xi'$ defined by 17·13 (4) tends to 0 there as $h \to \infty$
and the approximation fails. For a simple zero this can be treated by choosing $\xi$ so that
the differential equation reduces approximately to
\[ \frac{d^2z}{d\xi^2} = h^2\xi z, \tag{1} \]
solutions of which are $z_1 = \mathrm{Bi}(h^{2/3}\xi)$, $z_2 = \mathrm{Ai}(h^{2/3}\xi)$. Let $\chi = \chi_0 + \chi_1/h + \psi_2/h^2$ vanish at
$x = -a$ (where $a$ will be $O(1/h)$) and take $\chi_0'(0) > 0$. We can take
\[ \xi\left(\frac{d\xi}{dx}\right)^2 = \chi, \qquad \tfrac23\xi^{3/2} = \int_{-a}^{x}\chi^{1/2}\,dx, \tag{2} \]
and the transformation is non-singular at $\xi = 0$. Then the differential equation takes the
form
\[ \frac{d^2z}{d\xi^2} - h^2\xi z = g(\xi, h)\,z, \tag{3} \]
and solutions can be developed from integral equations as for the simple case of 17·13,
the successive corrections again decreasing as fast as the terms of a geometrical progression
in $1/h$. Conditions are needed on the region of validity $E$. The most useful set appears to
be that $E$ includes a segment of the real axis from 0 to $B$ and segments of straight lines
from 0 to $B_1e^{2\pi i/3}$ and from 0 to $B_2e^{-2\pi i/3}$; that any $\xi$ not on these lines can be connected
to a point $\xi_0$ on one of these segments by an arc $\rho^{3/2}\cos\tfrac32\theta = $ constant, where $t = \rho e^{i\theta}$;
the path of integration consists of segments of these three lines together with one of these
arcs; on any of the paths 0 to $B$, 0 to $\xi_0$ and $\xi_0$ to $\xi$ the integral $\int|t^{-1/2}g(t, h)|\,|dt|$ is
uniformly bounded.*

When $|h^{2/3}\xi|$ is large we can use the first terms of the asymptotic approximations to
Ai and Bi, with further errors of order $1/h\xi^{3/2}$.

We use Ai and Bi to connect solutions of the forms of those of 17·13 valid on opposite
sides of $\xi = 0$. We have
\[ y = \xi'^{-1/2}z = \chi^{-1/4}\xi^{1/4}z. \tag{4} \]
For $\xi > 0$ put $M = \int_{-a}^{x}h\chi^{1/2}\,dx = \tfrac23h\xi^{3/2}$; a pair of solutions are
\[ y_1 = \sqrt{\pi}\,h^{1/6}\xi'^{-1/2}\,\mathrm{Bi}(h^{2/3}\xi)\,\{1 + O(1/h)\} = \chi^{-1/4}\exp M\,\{1 + O(1/h)\}, \tag{5} \]
\[ y_2 = \sqrt{\pi}\,h^{1/6}\xi'^{-1/2}\,\mathrm{Ai}(h^{2/3}\xi)\,\{1 + O(1/h)\} = \tfrac12\chi^{-1/4}\exp(-M)\,\{1 + O(1/h)\}. \tag{6} \]
For $\xi < 0$, put $\xi = -\zeta$, $L = \int_{a}^{-x}h(-\chi)^{1/2}\,d(-x)$; then
\[ y_1 = |\chi|^{-1/4}\{\cos(L + \tfrac14\pi) + O(1/h)\}, \tag{7} \]
\[ y_2 = |\chi|^{-1/4}\{\sin(L + \tfrac14\pi) + O(1/h)\}, \tag{8} \]
where $\chi$ can be replaced by $\chi_0$ with a further error of order $1/h$.


It has been pointed out by Langer† that care is needed in the use of (5), (6), (7) and (8)
to establish correspondences between solutions on opposite sides of a zero of $\chi_0$. We have,
if $A$ and $B$ are constants, a general solution $Ay_1 + By_2$, with asymptotic expressions
\[ A\chi^{-1/4}\exp M\,\{1 + O(1/h)\} + \tfrac12B\chi^{-1/4}\exp(-M)\,\{1 + O(1/h)\}, \tag{9} \]
\[ A|\chi|^{-1/4}\{\cos(L + \tfrac14\pi) + O(1/h)\} + B|\chi|^{-1/4}\{\sin(L + \tfrac14\pi) + O(1/h)\}. \tag{10} \]
If the solution for $\xi > 0$ is exponentially small, it follows that $A = 0$ and hence the solution
must be, to $O(1/h)$, equal to $\tfrac12B\chi^{-1/4}\exp(-M)$ and $B|\chi|^{-1/4}\sin(L + \tfrac14\pi)$ on the respective
sides. If, however, $A$ is not known to be zero but merely to be of order $B/h$, it will not
affect the validity of the approximation on the oscillatory side; but the $A$ term will be
much larger than the $B$ term for large $M$, and the approximation on the exponential side
is ruined. Similarly if the approximation is $|\chi|^{-1/4}\cos(L + \tfrac14\pi)$ on the oscillatory side it
must be $\chi^{-1/4}\exp M$ on the exponential side, but the converse does not follow. In using
the formulae (9) and (10) confusion may be avoided if explicit attention is paid to the
error terms throughout the work.
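
The pairing of $\tfrac12\chi^{-1/4}\exp(-M)$ with $|\chi|^{-1/4}\sin(L + \tfrac14\pi)$ can be illustrated on the model equation $z'' = \xi z$ itself (that is, $h = 1$, $\chi = \xi$, $M = L = \tfrac23|\xi|^{3/2}$), for which $\sqrt{\pi}\,\mathrm{Ai}(\xi)$ is the exponentially small solution. The sketch below is an assumption-laden illustration: it evaluates Ai from its Maclaurin series, which is adequate only for moderate arguments, uses the standard values of $\mathrm{Ai}(0)$ and $\mathrm{Ai}'(0)$ in terms of the gamma function, and all names are chosen here rather than taken from the text.

```python
import math

AI_0 = 1 / (3 ** (2 / 3) * math.gamma(2 / 3))          # Ai(0)
AI_PRIME_0 = -1 / (3 ** (1 / 3) * math.gamma(1 / 3))   # Ai'(0)


def airy_ai(x, terms=80):
    """Ai(x) from its Maclaurin series; suitable for moderate |x| only."""
    f_term, g_term = 1.0, x
    f_sum, g_sum = f_term, g_term
    for k in range(1, terms):
        f_term *= x ** 3 / ((3 * k - 1) * (3 * k))
        g_term *= x ** 3 / ((3 * k) * (3 * k + 1))
        f_sum += f_term
        g_sum += g_term
    return AI_0 * f_sum + AI_PRIME_0 * g_sum


def exponential_side(xi):
    """(1/2) pi^(-1/2) xi^(-1/4) exp(-M), M = (2/3) xi^(3/2), for xi > 0."""
    return 0.5 / math.sqrt(math.pi) * xi ** -0.25 * math.exp(-(2 / 3) * xi ** 1.5)


def oscillatory_side(zeta):
    """pi^(-1/2) zeta^(-1/4) sin(L + pi/4), L = (2/3) zeta^(3/2), for argument -zeta."""
    return zeta ** -0.25 / math.sqrt(math.pi) * math.sin((2 / 3) * zeta ** 1.5 + math.pi / 4)


if __name__ == "__main__":
    print(airy_ai(4.0), exponential_side(4.0))
    print(airy_ai(-5.0), oscillatory_side(5.0))
```

The same function is matched by the decaying exponential on one side and by the sine with phase $\tfrac14\pi$ on the other, with the factor $\tfrac12$ appearing only on the exponential side.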

The device for crossing a zero of $\chi$ was first given by Rayleigh‡ and extended by
R. Gans.§ It has been rediscovered by several later writers. Langer points out that when
$\chi$ has two zeros the equation can be transformed to one of the forms

^ = ±W-i )+e(lh)}v 

with approximate solutions treated in 23·08, which can be used in a similar way to the
solutions of $y'' = h^2\xi y$ in the case of one zero.

* For these conditions and those imposed in 17·13, see H. Jeffreys, Proc. Camb. Phil. Soc. 49,
1953, 601-11.

† Especially Bull. Amer. Math. Soc. 1934, 545-82; Trans. Amer. Math. Soc. 67, 1949, 461-90.

‡ Proc. Roy. Soc. A, 86, 1912, 207-26.

§ Ann. d. Phys. 47, 1915, 709-36.


17·132. As an example of the method, take Bessel's equation for large order
\[ \frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left(1 - \frac{n^2}{x^2}\right)y = 0 \tag{1} \]
and put
\[ x = ne^{-z}. \tag{2} \]
Then
\[ \frac{d^2y}{dz^2} = n^2(1 - e^{-2z})\,y, \tag{3} \]
the oscillatory side being $x > n$ or $z < 0$. Then for $x < n$
\[ M = n\int_0^z(1 - e^{-2z})^{1/2}\,dz = n(\theta - \tanh\theta), \tag{4} \]
where
\[ x = n\,\mathrm{sech}\,\theta. \tag{5} \]
For $x > n$,
\[ L = n\int_z^0(e^{-2z} - 1)^{1/2}\,dz = n(\tan u - u), \tag{6} \]
where $x = n\sec u$. Then there is a solution equal, for $x > n$, to
\[ 2\tan^{-1/2}u\,[\sin\{n(\tan u - u) + \tfrac14\pi\} + O(1/n)] \]
and for $x < n$ to
\[ \tanh^{-1/2}\theta\,\exp\{-n(\theta - \tanh\theta)\}\,\{1 + O(1/n)\} \]
\[ = \left(\frac{n^2 - x^2}{n^2}\right)^{-1/4}x^n\{n + (n^2 - x^2)^{1/2}\}^{-n}\exp(n^2 - x^2)^{1/2}\,\{1 + O(1/n)\}. \tag{7} \]
When $x$ is small this is approximately
\[ x^n(2n)^{-n}e^n. \tag{8} \]
But the first term in the expansion of $J_n(x)$ is $x^n/2^nn!$, and if we approximate to the
factorial by Stirling's formula we see that our solution is a representation of $(2\pi n)^{1/2}J_n(x)$.
Another solution for $x > n$ is
\[ 2\tan^{-1/2}u\,[\cos\{n(\tan u - u) + \tfrac14\pi\} + O(1/n)] \]
and for $x < n$
\[ 2\tanh^{-1/2}\theta\,\exp\{n(\theta - \tanh\theta)\}\,\{1 + O(1/n)\} \]
\[ = 2\left(\frac{n^2 - x^2}{n^2}\right)^{-1/4}x^{-n}\{n + (n^2 - x^2)^{1/2}\}^{n}\exp\{-(n^2 - x^2)^{1/2}\}\,\{1 + O(1/n)\}, \tag{9} \]
which is a representation of the second solution of Bessel's equation denoted by
$-(2\pi n)^{1/2}Y_n(x)$ (21·02).

The errors of these approximations are of order $1/n$. They have one great advantage
over those in descending power series, that they can be used to fix the adjustable
constants so that the solution will represent the same function as the ascending power series
(the gap near the zero of $\chi$ being filled in, if necessary, by direct use of the Airy integral).
The corresponding adjustment for descending series usually requires the indirect method 
of complex integral representation, if such a representation exists; and if it does not, we 
may be reduced to numerical comparison in some range where both the convergent and 
the asymptotic expansions can be computed directly. 
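
The agreement between the ascending series and the approximation (7) can be checked numerically. The following Python sketch is only an illustration: the function names, the values n = 50, x = 25 and the number of series terms are assumptions made for the example, not part of the text. It compares the power series for $J_n(x)$ with (7) divided by $(2\pi n)^{1/2}$.

```python
import math

def bessel_j_series(n, x, terms=40):
    """Ascending power series for J_n(x); adequate for moderate x."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (x / 2) ** (2 * k + n) / (
            math.factorial(k) * math.factorial(k + n))
    return total

def bessel_j_large_order(n, x):
    """Approximation (7) divided by (2*pi*n)**0.5, for x < n."""
    theta = math.acosh(n / x)            # x = n sech(theta)
    m = n * (theta - math.tanh(theta))   # the exponent M of (4)
    return math.exp(-m) / math.sqrt(2 * math.pi * n * math.tanh(theta))

n, x = 50, 25.0                          # illustrative values only
exact = bessel_j_series(n, x)
approx = bessel_j_large_order(n, x)
ratio = approx / exact                   # should differ from 1 by a term of order 1/n
print(exact, approx, ratio)
```

The ratio printed should be close to 1, the discrepancy being of the order $1/n$ stated above.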

Applications of the method to Mathieu functions are given by H. Jeffreys, Proc. Lond.
Math. Soc. (2), 23, 1924, 437-76; to the transparency of a potential barrier in wave mechanics
by B. Jeffreys, Proc. Camb. Phil. Soc. 38, 1942, 401-5. Another method of treating this
problem can be based on the solutions of the differential equation treated in 23·08a.

Extensions to cases where $\chi$ vanishes to higher orders than the first are given by
S. Goldstein.*

A method, based on a suitable change of the independent variable, applicable to
non-linear ordinary and partial differential equations, is given by M. J. Lighthill.†

* Proc. Lond. Math. Soc. (2) 28, 1928, 81-90. † Phil. Mag. (7) 40, 1949, 1179-1201.


EXAMPLES 

1. If $n = x-h$, where $0<h<1$, $x>0$, prove that

$$\int_x^\infty t^{-n}e^{-t}\,dt = x^{-n}e^{-x}\int_0^\infty\left(1+\frac{v}{x}\right)^{-x+h}e^{-v}\,dv
\sim x^{-n}e^{-x}\left\{\frac12+\left(\frac18+\frac14h\right)\frac1x+\left(-\frac1{32}+\frac1{16}h+\frac18h^2\right)\frac1{x^2}+\dots\right\},$$

and indicate the use of such an expansion in improving the approximation to the incomplete factorial
function given by the asymptotic expansion of 17·01. (Bickley and Miller.)

2. A solution of the equation

$$y''+256\,y\,e^{4x} = 0$$

is zero at $x = 0$. Determine approximately where its other zeros lie.

If in addition $y' = 1$ at $x = 0$, find the position and magnitude of the first maximum for $x>0$.

(I.C. 1936.)

3. Show how to approximate to the solution of the equation

$$\frac{d^2y}{dx^2}+\frac{y}{X^4} = 0, \tag{1}$$

where $X$, a function of $x$, is such that $X''$ is small in comparison with $1/X^3$. Hence deduce the exact
solution when $X'' = 0$.

Compare the solution with that of 

/i 2 i/ w 

( 2 ) 


^+-^= 0 . 

dx 2 X* 


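(I.C. 1937.)

4. If

$$x\frac{d}{dx}\left(x\frac{dy}{dx}\right)+(n^2-x^2)\,y = 0,$$

where $n$ is large and $y\to0$ as $x\to\infty$, prove that $y$ is a multiple of a function given asymptotically by

$$\left(\frac{x^2}{n^2}-1\right)^{-1/4}\exp\left\{-n\left(\frac{x^2}{n^2}-1\right)^{1/2}+n\sec^{-1}\frac{x}{n}\right\}\quad(x>n),$$

$$2\left(1-\frac{x^2}{n^2}\right)^{-1/4}\sin\{n\log\tan(\tfrac14\pi+\tfrac12\psi)-n\sin\psi+\tfrac14\pi\}\quad(x<n),$$

where $\cos\psi = x/n$.

5. If

$$x\frac{d}{dx}\left(x\frac{dy}{dx}\right)+(n^2+x^2)\,y = 0,$$

where $n$ is large, prove that

$$y\sim\left(1+\frac{x^2}{n^2}\right)^{-1/4}\begin{matrix}\cos\\ \sin\end{matrix}\;n(\sec v+\log\tan\tfrac12v),$$

where $v = \tan^{-1}(x/n)$.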

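6. If $x$ is large and positive, prove that

$$\int_0^x\mathrm{Ai}(x)\,dx\sim\frac13-\frac{x^{-3/4}}{2\sqrt\pi}\exp(-\tfrac23x^{3/2}),$$

and if $x = -\xi$, where $\xi$ is large and positive, prove that

$$\int_0^x\mathrm{Ai}(x)\,dx\sim-\frac23+\frac{\xi^{-3/4}}{\sqrt\pi}\cos(\tfrac23\xi^{3/2}+\tfrac14\pi).$$

7. Prove that

$$\mathrm{Ai}(z)\,\mathrm{Bi}'(z)-\mathrm{Ai}'(z)\,\mathrm{Bi}(z) = \frac1\pi,$$

verifying that the same constant is obtained by taking $z$ small, $z$ large and positive, or $-z$ large and
positive.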

8. If $\tfrac12\pi>|\arg z|>\delta>0$, prove that Stirling's formula can be extended to $(-z)!$.

(Use $z!\,(-z)! = \pi z\,\mathrm{cosec}\,\pi z$.)



Chapter 18 


THE EQUATIONS OF POTENTIAL, WAVES, 
AND HEAT CONDUCTION 


Divide et impera. 

LOUIS XI.


18·01. The gravitational potential in free space satisfies Laplace's equation

$$\nabla^2\phi = \frac{\partial^2\phi}{\partial x^2}+\frac{\partial^2\phi}{\partial y^2}+\frac{\partial^2\phi}{\partial z^2} = 0. \tag{1}$$

The same equation is satisfied in free space by the electric and magnetic potentials if the
field is steady, and by the velocity potential in incompressible fluid. In a uniform
compressible fluid the velocity potential satisfies

$$\nabla^2\phi = \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2}, \tag{2}$$

where $c$ is the velocity of sound, provided that the velocities are small. For elastic waves
in a uniform solid the same equation is satisfied by the scalar and vector potentials, which
give the longitudinal and transverse displacements respectively, with two different
values of $c$. The same equation is satisfied by the field components in electromagnetic
waves, $c$ being then the velocity of light. In a uniform material at rest the temperature
satisfies

$$\nabla^2\phi = \frac{1}{h^2}\frac{\partial\phi}{\partial t}, \tag{3}$$

where $h^2$ is the thermometric conductivity, defined as the thermal conductivity divided
by the heat capacity per unit volume. An equation of the same form is satisfied in
diffusion if $\phi$ is the concentration of the diffusing material.

Evidently in a steady state these two equations both reduce to Laplace's equation.
The equations of vibration of water in a lake or channel of uniform depth, and that of
vibration of a membrane, are the wave equation without the term $\partial^2\phi/\partial z^2$.

These three equations are so widely applicable that they are often called ‘ the differential 
equations of physics’. They do not include the wave equation of quantum mechanics, 
but even for this more complex equation the study of these equations is a necessary 
introduction. 

The possibility of solving them depends chiefly on the fact that they are separable in
several systems of coordinates; that is, they have solutions that break up into factors,
each factor being a function of one coordinate or $t$. We try to choose the coordinate system
so that one coordinate is constant over a surface where a given boundary condition has
to be satisfied by $\phi$. Taking, for instance, the wave equation in rectangular coordinates,
we try a form of solution

$$\phi = XYZT, \tag{4}$$

where $X$ is a function of $x$ only, and so on. Substitute in (2) and divide by $XYZT$. Then

$$\frac1X\frac{d^2X}{dx^2}+\frac1Y\frac{d^2Y}{dy^2}+\frac1Z\frac{d^2Z}{dz^2} = \frac{1}{c^2T}\frac{d^2T}{dt^2}. \tag{5}$$

Each term is a function of only one of the independent variables. Hence if the equation
is to hold for all values of $x, y, z, t$ each term must be constant, and every expression of
the form

$$A\exp\{i(lx+my+nz-\gamma t)\}, \tag{6}$$

where $A, l, m, n, \gamma$ are constants and

$$l^2+m^2+n^2 = \gamma^2/c^2, \tag{7}$$

is a solution; so is any sum of expressions of this form. The complex exponentials can
evidently be replaced by cosines and sines. Now we know from Fourier's theorem that
in a bounded region the values of $\phi$ and $\partial\phi/\partial t$ at $t = 0$ can in general be expressed in a
series of products of sines and cosines of $lx, my, nz$; and then we derive the complete
solution by associating with each term its proper factor in $\gamma t$. As an example, consider a
rectangular membrane whose corners are at $(0,0)$, $(a,0)$, $(0,b)$, $(a,b)$. Since the margin
is supposed to be fixed we require a solution that vanishes wherever $x = 0$ or $a$, or $y = 0$
or $b$. Thus the admissible terms will contain factors $\sin(l\pi x/a)\sin(m\pi y/b)$, where $l$ and $m$ are
now integers; cosines will not vanish on the edges. The solution will then be

$$\phi = \sum_{l=1}^\infty\sum_{m=1}^\infty\sin\frac{l\pi x}{a}\sin\frac{m\pi y}{b}\,(A_{l,m}\cos\gamma t+B_{l,m}\sin\gamma t),$$

with

$$\gamma^2 = \pi^2c^2\left(\frac{l^2}{a^2}+\frac{m^2}{b^2}\right).$$

Then (assuming the possibility of differentiating the series)

$$(\phi)_{t=0} = \sum\sum A_{l,m}\sin\frac{l\pi x}{a}\sin\frac{m\pi y}{b},\qquad
\left(\frac{\partial\phi}{\partial t}\right)_{t=0} = \sum\sum\gamma B_{l,m}\sin\frac{l\pi x}{a}\sin\frac{m\pi y}{b}.$$

If we know the initial values of $\phi$ and $\partial\phi/\partial t$, we can expand them in double Fourier sine
series, and comparison of coefficients determines $A_{l,m}$ and $B_{l,m}$. Thus the solution is found.
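
The comparison of coefficients for the rectangular membrane can be sketched numerically. In the following Python fragment the initial displacement, the grid size and the function names are assumptions chosen only for illustration; the coefficient is obtained from the usual double sine-series formula by the midpoint rule, and the frequency from the relation for $\gamma$ above.

```python
import math

def sine_coefficient(phi0, a, b, l, m, n=100):
    """Coefficient A_{l,m} of the double Fourier sine series of phi0 on the
    rectangle 0<x<a, 0<y<b, by the midpoint rule on an n-by-n grid."""
    hx, hy = a / n, b / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        sx = math.sin(l * math.pi * x / a)
        for j in range(n):
            y = (j + 0.5) * hy
            total += phi0(x, y) * sx * math.sin(m * math.pi * y / b)
    return 4.0 / (a * b) * total * hx * hy

def mode_frequency(l, m, a, b, c):
    """gamma for the (l, m) mode: gamma^2 = pi^2 c^2 (l^2/a^2 + m^2/b^2)."""
    return math.pi * c * math.sqrt((l / a) ** 2 + (m / b) ** 2)

a, b, c = 1.0, 1.0, 1.0                          # illustrative dimensions and wave velocity
phi0 = lambda x, y: x * (a - x) * y * (b - y)    # hypothetical initial displacement
A11 = sine_coefficient(phi0, a, b, 1, 1)
print(A11, mode_frequency(1, 1, a, b, c))
```

For this particular initial displacement the exact value of the leading coefficient is $64a^2b^2/\pi^6$, which gives a check on the quadrature.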

For the oscillations of water in a shallow rectangular lake of uniform depth we have
the same differential equation to be satisfied by $\zeta$, the vertical displacement of the free
surface. The boundary conditions are different. It is now the velocity normal to the
boundary that must vanish there, and this is proportional to $\partial\zeta/\partial n$. Hence at $x = 0$ and
$x = a$, $\partial\zeta/\partial x = 0$; at $y = 0$ and $y = b$, $\partial\zeta/\partial y = 0$. The appropriate solutions satisfying the
boundary conditions are now products of cosines instead of sines. Hence $\zeta$ and $\partial\zeta/\partial t$
for $t = 0$ must be expanded in double Fourier cosine instead of sine series; the time
factors are applied as before. Evidently if $A_{0,0}$ is not 0 it means that the variation of $\zeta$
is about a mean different from 0, that is, that the origin of $\zeta$ has not been taken at the
undisturbed level of the free surface. The mean of $(\partial\zeta/\partial t)_{t=0}$ over the rectangle must be 0,
for if not it would imply that the total quantity of water was varying; hence $B_{0,0} = 0$.

The condition that the initial $\phi$ and $\partial\phi/\partial t$ can be expanded in Fourier series (in one,
two or three variables according as the region is one-, two- or three-dimensional) will be
satisfied by most functions that occur in practice.

18·011. Equations of elliptic, parabolic, and hyperbolic types. We notice that
even for wave propagation in one dimension we can get solutions of

$$\frac{\partial^2\phi}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2}$$

that vanish for all time at $x = a, b$ and are not identically zero, and such solutions can
also vanish for $a<x<b$ if $t = 0$ or $2(b-a)/c$. This property is to be contrasted with that
of solutions of Laplace's equation in two dimensions

$$\frac{\partial^2\phi}{\partial x^2}+\frac{\partial^2\phi}{\partial y^2} = 0.$$

Here if $\phi = 0$ for all $y$ at $x = 0$ and $x = a$, and for all $x$ at $y = 0$ and $y = b$, $\phi$ must be 0
for all $x, y$ within the rectangle. The difference can be traced to the fact that, if we take
all terms to the same side of the equation, the signs are the same for Laplace's equation
and opposite for the wave equation. More generally, if the terms in second derivatives in a
differential equation are

$$a\frac{\partial^2\phi}{\partial x^2}+2h\frac{\partial^2\phi}{\partial x\,\partial y}+b\frac{\partial^2\phi}{\partial y^2},$$

the conditions that the solution can be made
to satisfy at the boundaries are quite different according as $ab-h^2$ is positive or negative.
In the former case if $\phi$ is given on a closed curve it is determined everywhere within it,
and the equation is said to be of elliptic type. In the latter the equation is said to be of
hyperbolic type and no such conclusion follows. We shall not enter into the general theory,
which is given in full by Webster.* $a, b, h$ need not be constants, but $ab-h^2$ may then
change sign within the region. This condition occurs in the motion of projectiles at
velocities above that of sound and is then connected with the formation of a shock wave.†
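
The criterion can be put in the form of a small routine. The following Python sketch (the function name and the sample coefficients are illustrative assumptions) classifies the second-order terms at a point from the sign of $ab-h^2$; the intermediate case $ab-h^2 = 0$ is the one usually called parabolic.

```python
def classify(a, h, b):
    """Type of  a*phi_xx + 2*h*phi_xy + b*phi_yy + (lower-order terms)
    at a point, from the sign of a*b - h**2."""
    d = a * b - h * h
    if d > 0:
        return "elliptic"
    if d < 0:
        return "hyperbolic"
    return "parabolic"

print(classify(1, 0, 1))    # Laplace's equation in x, y
print(classify(1, 0, -1))   # one-dimensional wave equation with c = 1
print(classify(1, 0, 0))    # no second derivative in the second variable
```

When $a, b, h$ vary with position the routine would have to be applied point by point, since the type may change within the region.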

18·012. The equation of heat conduction in one dimension is intermediate in character
($ab-h^2 = 0$) and is said to be of parabolic type. But more generally, since it is of the first
order with regard to $t$, we cannot assign the values of $\phi$ and $\partial\phi/\partial t$ at $t = 0$ independently.
The extension to more dimensions is straightforward and we shall only illustrate from
the one-dimensional case. If

$$\frac{\partial\phi}{\partial t} = h^2\frac{\partial^2\phi}{\partial x^2}$$

and $\phi = 0$ at $x = 0$ and $x = a$, a solution is

$$\phi = \sin\frac{n\pi x}{a}\exp\left(-\frac{h^2n^2\pi^2t}{a^2}\right).$$

Hence if $\phi$ at $t = 0$ is expanded in a Fourier series we get the solution

$$\phi = \sum_{n=1}^\infty A_n\sin\frac{n\pi x}{a}\exp\left(-\frac{h^2n^2\pi^2t}{a^2}\right).$$

* Partial Differential Equations of Mathematical Physics, Chap. 6.
† G. I. Taylor and J. W. Maccoll, Proc. Roy. Soc. A, 139, 1933, 278-311.

The characteristic feature is that all the exponents are negative; if the temperature is
kept zero at two points it will tend to zero everywhere as the time increases. If the series
converges for $t = 0$, $\phi$ and all its derivatives of whatever order with regard to both $x$ and
$t$ exist and will be represented by convergent series for $t>0$, however small $t$ may be.
The universal tendency of heat conduction is to smooth out differences of temperature.
This, of course, is another way of stating the second law of thermodynamics, but the
equation of heat conduction provides a time scale for the smoothing and the second law
does not. When $\phi$ satisfies the wave equation we can say that when $t$ varies the inequalities
of $\phi$ are not reduced but merely transferred to other places.

It is possible to have periodic solutions of the heat equation, but only if there is a 
periodic source of heat, either internal or at the boundary. 
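
The smoothing by the negative exponents can be seen in a short computation. The Python sketch below assumes, purely for illustration, the initial temperature $x(a-x)$ with both ends kept at zero; its sine coefficients are $8a^2/(n\pi)^3$ for odd $n$ and zero for even $n$. The function name, the parameter `h2` (standing for $h^2$) and the number of terms are assumptions of the example.

```python
import math

def heat_series(x, t, a=1.0, h2=1.0, terms=200):
    """phi(x, t) for initial temperature x*(a - x), ends kept at zero.
    The sine coefficients are 8*a**2/(n*pi)**3 for odd n and 0 for even n."""
    total = 0.0
    for n in range(1, terms + 1, 2):
        coeff = 8.0 * a * a / (n * math.pi) ** 3
        decay = math.exp(-h2 * (n * math.pi / a) ** 2 * t)
        total += coeff * math.sin(n * math.pi * x / a) * decay
    return total

for t in (0.0, 0.01, 0.1, 0.5):
    print(t, heat_series(0.5, t))
```

At the later times only the first term of the series contributes appreciably, the higher harmonics having been removed much more rapidly.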

Another peculiarity of the equation of conduction is that there is a severe restriction
on the distributions of temperature that can be the successors of any previous
distributions. We have seen that Fourier series in practice usually converge like $\sum n^{-s}$, where $s$ is
a small integer. In a favourable case like the series of 14·05, it converges like $\sum(r/a)^n$,
with $r<a$. Now suppose that $\phi$ was expressible, at time $-\tau$ ($\tau>0$), by a Fourier
series, however slowly convergent. Then the terms at time 0 would decrease like
$\exp(-h^2n^2\pi^2\tau/a^2)$, which is a faster decrease than for any geometric series. If, then, the
Fourier series of $\phi$ at $t = 0$ does not satisfy this condition for some $\tau>0$ it cannot be the
successor of any previous distribution unless there has been disturbance from outside.

18·013. Whittaker's general solutions. The test of whether we have obtained
the most general solution of a partial differential equation is not a mere matter of counting
adjustable constants, as it is for ordinary differential equations. We shall only quote
types of general solution that have been given by Whittaker. A general solution of
Laplace's equation is

$$\phi = \int_{-\pi}^{\pi}f(z+ix\cos u+iy\sin u,\,u)\,du,$$

where $f(\zeta,u)$ is an arbitrary function; a general solution of the wave equation is

$$\phi = \int_{-\pi}^{\pi}\int_{-\pi}^{\pi}f(x\sin u\cos v+y\sin u\sin v+z\cos u+ct,\,u,\,v)\,du\,dv.$$

These solutions are capable of a wide range of transformation; numerous applications
are given by Whittaker and Watson.
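
That an integral of Whittaker's form satisfies Laplace's equation can be verified numerically for a particular choice of $f$. In the Python sketch below the choice $f(\zeta,u) = \zeta^3\cos u$, the sample point and the step lengths are assumptions made only for the illustration; the integral is evaluated by the trapezoidal rule and the Laplacian by second differences.

```python
import math

def whittaker_phi(x, y, z, f, n=64):
    """phi = integral over u from -pi to pi of f(z + i x cos u + i y sin u, u),
    by the trapezoidal rule (very accurate for periodic integrands)."""
    du = 2 * math.pi / n
    total = 0j
    for k in range(n):
        u = -math.pi + k * du
        zeta = z + 1j * (x * math.cos(u) + y * math.sin(u))
        total += f(zeta, u)
    return total * du

def laplacian(phi, x, y, z, d=1e-2):
    """Seven-point finite-difference estimate of the Laplacian of phi."""
    centre = phi(x, y, z)
    return (phi(x + d, y, z) + phi(x - d, y, z)
            + phi(x, y + d, z) + phi(x, y - d, z)
            + phi(x, y, z + d) + phi(x, y, z - d) - 6 * centre) / d ** 2

f = lambda zeta, u: zeta ** 3 * math.cos(u)      # an arbitrary illustrative choice
phi = lambda x, y, z: whittaker_phi(x, y, z, f)
value = phi(0.3, -0.4, 0.7)
residual = abs(laplacian(phi, 0.3, -0.4, 0.7))
print(value, residual)
```

For this $f$ the integral can be done by hand and gives $i\pi\{3z^2x-\tfrac34x(x^2+y^2)\}$, a harmonic polynomial, so that the residual printed should be zero apart from rounding.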


18·02. Curvilinear coordinates. For other forms of boundary, solutions of the
form 18·01 (4) are seldom obvious, though Whittaker's general solutions can often be
adapted to other forms of boundary on transformation of coordinates. We can,
alternatively, write

$$\phi = f(x,y,z)\,T, \tag{1}$$

and then

$$\frac{\nabla^2f}{f} = 0,\qquad\frac{1}{c^2T}\frac{d^2T}{dt^2},\qquad\frac{1}{h^2T}\frac{dT}{dt}, \tag{2}$$

according as the equation considered is Laplace's equation, the wave equation, or the
equation of heat conduction. Both sides must be constant, and the equations are all
reduced to the form

$$\nabla^2f = -\kappa^2f, \tag{3}$$

to be solved in accordance with the boundary conditions. Now the wave equation in a
continuous medium can be regarded as the limit of the set of equations of motion for a
set of particles forming a stable close lattice. Then if we take $T = e^{\gamma t}$ all possible values
of $\gamma^2$ for normal modes are real and negative. It follows at once that with the same
boundary conditions on $f(x,y,z)$ all values of $\gamma$ for free flow of heat are real and negative.
The determination of the solutions for given initial conditions therefore reduces to finding
an expansion of a general $f$ in terms of the characteristic functions of the equation (3).
The time factor is thus treated separately in any case and we have simply to consider (3),
where $\kappa^2$ will be 0 in potential problems. The time factor will in general not be written
explicitly since it can always be restored at the end. This equation is separable in several
other coordinate systems besides Cartesian ones. In general curvilinear orthogonal
coordinates $\xi_1, \xi_2, \xi_3$ are used, the elements of length $ds_1, ds_2, ds_3$ corresponding to small
changes $d\xi_1, d\xi_2, d\xi_3$ being $h_1\,d\xi_1, h_2\,d\xi_2, h_3\,d\xi_3$.


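Now by Green's lemma

$$\iiint\nabla^2\phi\,d\tau = \iint\frac{\partial\phi}{\partial n}\,dS. \tag{4}$$

Apply this result to a small volume bounded by the surfaces

$$\xi_1 = \xi_{10}\pm\tfrac12\delta\xi_1,\qquad\xi_2 = \xi_{20}\pm\tfrac12\delta\xi_2,\qquad\xi_3 = \xi_{30}\pm\tfrac12\delta\xi_3. \tag{5}$$

On a surface of constant $\xi_1$, if $n$ is in the direction of increasing $\xi_1$,

$$\frac{\partial\phi}{\partial n} = \frac{\partial\phi}{h_1\,\partial\xi_1},$$

and the area of the element given by ranges $\delta\xi_2, \delta\xi_3$ is $h_2h_3\,\delta\xi_2\,\delta\xi_3$. Then $\int(\partial\phi/\partial n)\,dS$ over
such an element is $\iint(h_2h_3/h_1)(\partial\phi/\partial\xi_1)\,d\xi_2\,d\xi_3$. The two surfaces given by $\xi_1 = \xi_{10}\pm\tfrac12\delta\xi_1$ will
together contribute

$$\delta\xi_1\,\delta\xi_2\,\delta\xi_3\,\frac{\partial}{\partial\xi_1}\left(\frac{h_2h_3}{h_1}\frac{\partial\phi}{\partial\xi_1}\right)+o(\delta\xi_1\,\delta\xi_2\,\delta\xi_3),$$

since on the surface with the larger $\xi_1$ the outward normal is in the direction of increasing
$\xi_1$ and on the opposite one it is in the direction of decreasing $\xi_1$. Then the integral on the
right of (4) is the sum of these expressions for the three pairs of opposite faces. The element
of volume is $h_1h_2h_3\,\delta\xi_1\,\delta\xi_2\,\delta\xi_3+o(\delta\xi_1\,\delta\xi_2\,\delta\xi_3)$. Hence

$$\nabla^2\phi = \frac{1}{h_1h_2h_3}\left\{\frac{\partial}{\partial\xi_1}\left(\frac{h_2h_3}{h_1}\frac{\partial\phi}{\partial\xi_1}\right)+\frac{\partial}{\partial\xi_2}\left(\frac{h_3h_1}{h_2}\frac{\partial\phi}{\partial\xi_2}\right)+\frac{\partial}{\partial\xi_3}\left(\frac{h_1h_2}{h_3}\frac{\partial\phi}{\partial\xi_3}\right)\right\} \tag{6}$$

almost everywhere, and everywhere if both sides are continuous. The latter condition
will usually be obviously satisfied by the solutions we shall obtain, and the equation to
be solved is

$$\frac{1}{h_1h_2h_3}\left\{\frac{\partial}{\partial\xi_1}\left(\frac{h_2h_3}{h_1}\frac{\partial\phi}{\partial\xi_1}\right)+\frac{\partial}{\partial\xi_2}\left(\frac{h_3h_1}{h_2}\frac{\partial\phi}{\partial\xi_2}\right)+\frac{\partial}{\partial\xi_3}\left(\frac{h_1h_2}{h_3}\frac{\partial\phi}{\partial\xi_3}\right)\right\} = -\kappa^2\phi. \tag{7}$$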
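
The expression (6) lends itself to a direct finite-difference check. The Python sketch below is an illustration under stated assumptions: the function names, the step length and the two test functions are not from the text, and the scale factors used are those of ordinary cylindrical coordinates, $h_1 = 1$, $h_2 = $ radius, $h_3 = 1$.

```python
import math

def laplacian_curvilinear(phi, scale, xi, d=1e-3):
    """Finite-difference form of (6).  phi(xi1, xi2, xi3) is the function,
    scale(xi1, xi2, xi3) returns (h1, h2, h3), xi is the point."""
    def flux(i, p):
        # (h1*h2*h3 / h_i**2) * d(phi)/d(xi_i) at the point p
        h = scale(*p)
        up, down = list(p), list(p)
        up[i] += d / 2
        down[i] -= d / 2
        dphi = (phi(*up) - phi(*down)) / d
        return h[0] * h[1] * h[2] / h[i] ** 2 * dphi

    total = 0.0
    for i in range(3):
        up, down = list(xi), list(xi)
        up[i] += d / 2
        down[i] -= d / 2
        total += (flux(i, up) - flux(i, down)) / d
    h = scale(*xi)
    return total / (h[0] * h[1] * h[2])

# Cylindrical coordinates (radius, azimuth, z): h1 = 1, h2 = radius, h3 = 1.
cyl = lambda r, lam, z: (1.0, r, 1.0)
harmonic = lambda r, lam, z: r ** 2 * math.cos(2 * lam)   # x**2 - y**2
square = lambda r, lam, z: r ** 2                          # x**2 + y**2
print(laplacian_curvilinear(harmonic, cyl, (1.3, 0.7, 0.2)))
print(laplacian_curvilinear(square, cyl, (1.3, 0.7, 0.2)))
```

The first function is $x^2-y^2$, which is harmonic, and the second is $x^2+y^2$, whose Laplacian is 4; the two printed values should agree with these to within the discretization error.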


The transformation is particularly simple if the coordinate $\xi_3$ is $z$. This choice of $\xi_3$ is
convenient if the boundary is a cylinder of any form of section. Then $h_3 = 1$, and $\xi_1, \xi_2$
are functions of $x, y$ only. Now if

$$\xi_1+i\xi_2 = f(x+iy), \tag{8}$$

the orthogonality relations are satisfied on account of the Cauchy-Riemann relations, and

$$h_1^2 = h_2^2 = \left|\frac{d(x+iy)}{d(\xi_1+i\xi_2)}\right|^2 = h^2,\ \text{say}. \tag{9}$$

Then

$$\frac{1}{h^2}\left(\frac{\partial^2\phi}{\partial\xi_1^2}+\frac{\partial^2\phi}{\partial\xi_2^2}+h^2\frac{\partial^2\phi}{\partial z^2}\right) = -\kappa^2\phi, \tag{10}$$

or

$$\frac{\partial^2\phi}{\partial\xi_1^2}+\frac{\partial^2\phi}{\partial\xi_2^2}+h^2\frac{\partial^2\phi}{\partial z^2} = -h^2\kappa^2\phi. \tag{11}$$

In particular if $\phi$ is independent of $z$ and $t$ the equation reduces to Laplace's equation
with regard to $\xi_1$ and $\xi_2$.
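
The scale factor (9) can be evaluated numerically for any analytic relation between $x+iy$ and $\xi_1+i\xi_2$. The Python sketch below (function name, step length and sample point are assumptions for illustration) does this for the three relations used in 18·03 to 18·05, namely the exponential, the square and the hyperbolic cosine, and prints beside each the closed form for $h^2$ quoted in those sections.

```python
import cmath
import math

def scale_factor_squared(w, zeta, d=1e-6):
    """h**2 = |d(x+iy)/d(xi1+i*xi2)|**2 for x + i y = w(xi1 + i*xi2),
    using a central difference for the derivative."""
    derivative = (w(zeta + d) - w(zeta - d)) / (2 * d)
    return abs(derivative) ** 2

xi1, xi2, c = 0.8, 1.1, 2.0          # illustrative values only
zeta = complex(xi1, xi2)

# x + i y = exp(zeta): cylindrical coordinates, h**2 = radius**2
print(scale_factor_squared(cmath.exp, zeta), math.exp(2 * xi1))
# x + i y = zeta**2: parabolic cylinder coordinates, h**2 = 4(xi1**2 + xi2**2)
print(scale_factor_squared(lambda s: s * s, zeta), 4 * (xi1 ** 2 + xi2 ** 2))
# x + i y = c cosh(zeta): elliptic cylinder coordinates
print(scale_factor_squared(lambda s: c * cmath.cosh(s), zeta),
      0.5 * c * c * (math.cosh(2 * xi1) - math.cos(2 * xi2)))
```

Each pair of printed numbers should agree to within the error of the numerical differentiation.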

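18·03. Cylindrical coordinates. If

$$\varpi^2 = x^2+y^2,\qquad\lambda = \tan^{-1}(y/x), \tag{1}$$

$$\log\varpi+i\lambda = \log(x+iy), \tag{2}$$

and we can take

$$\xi_1 = \log\varpi,\qquad\xi_2 = \lambda,\qquad x+iy = e^{\xi_1+i\xi_2},\qquad h^2 = |x+iy|^2 = \varpi^2. \tag{3}$$

Then

$$\frac{\partial^2\phi}{\partial(\log\varpi)^2}+\frac{\partial^2\phi}{\partial\lambda^2}+\varpi^2\left(\frac{\partial^2\phi}{\partial z^2}+\kappa^2\phi\right) = 0, \tag{4}$$

that is,

$$\varpi\frac{\partial}{\partial\varpi}\left(\varpi\frac{\partial\phi}{\partial\varpi}\right)+\frac{\partial^2\phi}{\partial\lambda^2}+\varpi^2\left(\frac{\partial^2\phi}{\partial z^2}+\kappa^2\phi\right) = 0. \tag{5}$$

If

$$\phi = \mathrm{P}\Lambda Z, \tag{6}$$

where $\mathrm{P}, \Lambda, Z$ are functions of $\varpi, \lambda, z$ respectively,

$$\frac{1}{\mathrm{P}}\varpi\frac{d}{d\varpi}\left(\varpi\frac{d\mathrm{P}}{d\varpi}\right)+\frac1\Lambda\frac{d^2\Lambda}{d\lambda^2}+\varpi^2\left(\frac1Z\frac{d^2Z}{dz^2}+\kappa^2\right) = 0. \tag{7}$$

The second term is a function of $\lambda$ only, the others do not involve $\lambda$; take it as $-n^2$. Then

$$\frac{1}{\varpi\mathrm{P}}\frac{d}{d\varpi}\left(\varpi\frac{d\mathrm{P}}{d\varpi}\right)+\frac1Z\frac{d^2Z}{dz^2}+\kappa^2-\frac{n^2}{\varpi^2} = 0. \tag{8}$$

The second term involves $z$ only, the others are independent of $z$. Take it to be $-\mu^2$. Then

$$\frac{1}{\varpi\mathrm{P}}\frac{d}{d\varpi}\left(\varpi\frac{d\mathrm{P}}{d\varpi}\right)+\kappa^2-\mu^2-\frac{n^2}{\varpi^2} = 0. \tag{9}$$

Finally, put $(\kappa^2-\mu^2)^{1/2}\varpi = \xi$; then

$$\frac1\xi\frac{d}{d\xi}\left(\xi\frac{d\mathrm{P}}{d\xi}\right)+\left(1-\frac{n^2}{\xi^2}\right)\mathrm{P} = 0, \tag{10}$$

which is Bessel's equation.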

This transformation has a singularity at the origin; that is, $\lambda$ is not a single-valued
function of $x$ and $y$ if we are permitted to make a complete circuit about the origin. But $\phi$
must be single-valued, and therefore $\Lambda$ must be. This can be ensured if $n$ is an integer, for
$\cos n\lambda$ and $\sin n\lambda$ are then single-valued functions of $x$ and $y$, but not otherwise. If the
solution is to hold within a complete circle $n$ must therefore be an integer, which without
loss of generality can be taken positive. But if it is to hold only within a sector $\alpha<\lambda<\beta$
there will in general be two boundary conditions at $\lambda = \alpha$ and $\lambda = \beta$. Suppose for instance
that $\phi$ is to vanish at all points of the boundary. Then we are limited to solutions of the
form

$$\Lambda = \sin n(\lambda-\alpha),$$

where $n$ is restricted to satisfy

$$\sin n(\beta-\alpha) = 0,$$

and again only a discrete set of values of $n$ are possible. It is a general feature of
transformations with a singularity within the region considered that the single-valuedness
of the solution introduces what is virtually an extra boundary condition.

Similarly if $\phi$ is required to be finite at the origin we are limited to the solutions $J_n(\xi)$,
for the other solutions $Y_n(\xi)$ of Bessel's equation are infinite at the origin. If further $\phi$ is to
vanish over the circle $\varpi = a$ we must have $J_n\{(\kappa^2-\mu^2)^{1/2}a\} = 0$. $J_n$ vanishes for an infinite
number of values of the argument, and the boundary conditions determine a set of
possible values of $\kappa^2-\mu^2$. Further, if the solution is to hold within a cylinder of finite length
there will be a restriction on the possible values of $\mu$. If the solution is to hold only between
two circles the other solution of Bessel's equation will also be required if the boundary
conditions are to be satisfied, and the new boundary condition will determine the ratio
of the coefficients.
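
The way in which the condition on the circle fixes the possible values of $\kappa^2-\mu^2$ can be illustrated by locating the zeros of $J_n$ numerically. The Python sketch below is a minimal one: the function names, the scanning step, the radius `a` and the use of the ascending series (adequate only for moderate arguments) are assumptions of the example.

```python
import math

def bessel_j(n, x, terms=60):
    """Ascending power series for J_n(x), n a non-negative integer."""
    return sum((-1) ** k * (x / 2) ** (2 * k + n)
               / (math.factorial(k) * math.factorial(k + n))
               for k in range(terms))

def bessel_zeros(n, count, step=0.1):
    """First `count` positive zeros of J_n, by scanning and bisection."""
    zeros = []
    lo = step
    f_lo = bessel_j(n, lo)
    while len(zeros) < count:
        hi = lo + step
        f_hi = bessel_j(n, hi)
        if f_lo * f_hi < 0:
            left, right, f_left = lo, hi, f_lo
            for _ in range(60):
                mid = 0.5 * (left + right)
                f_mid = bessel_j(n, mid)
                if f_left * f_mid <= 0:
                    right = mid
                else:
                    left, f_left = mid, f_mid
            zeros.append(0.5 * (left + right))
        lo, f_lo = hi, f_hi
    return zeros

a = 1.0                        # hypothetical radius of the circle
for j in bessel_zeros(0, 3):
    print(j, (j / a) ** 2)     # zero of J_0 and the corresponding kappa^2 - mu^2
```

Each zero $j$ of $J_n$ gives an admissible value $\kappa^2-\mu^2 = j^2/a^2$.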


18·04. Parabolic cylinder coordinates. Take

$$\xi_1+i\xi_2 = (x+iy)^{1/2},\qquad x = \xi_1^2-\xi_2^2,\qquad y = 2\xi_1\xi_2. \tag{1}$$

If $\xi_1$ is constant, we can eliminate $\xi_2$ and get

$$x = \xi_1^2-\frac{y^2}{4\xi_1^2}, \tag{2}$$

a set of parabolas with a common focus at the origin and proceeding towards negative $x$.
If $\xi_2$ is constant we get similarly

$$x = \frac{y^2}{4\xi_2^2}-\xi_2^2, \tag{3}$$

a set of confocal parabolas with their axes towards $x = +\infty$. Then

$$h^2 = 4(\xi_1^2+\xi_2^2). \tag{4}$$

If also

$$\frac{\partial^2\phi}{\partial z^2} = -\mu^2\phi \tag{5}$$

there will be solutions of the form

$$\phi = X_1(\xi_1)\,X_2(\xi_2)\,Z(z) \tag{6}$$

if

$$\frac{d^2X_1}{d\xi_1^2}+\{4(\kappa^2-\mu^2)\xi_1^2-a\}X_1 = 0, \tag{7}$$

$$\frac{d^2X_2}{d\xi_2^2}+\{4(\kappa^2-\mu^2)\xi_2^2+a\}X_2 = 0, \tag{8}$$

where $a$ is a constant. The substitution $\xi_1 = i\eta$ turns the first equation into the second, so
that solution of problems relating to parabolic boundaries requires the solution of the
same differential equation for purely real and purely imaginary argument. If $\kappa^2>\mu^2$
solutions will be oscillating for both $\xi_1$ and $\xi_2$ if they are large enough.

This equation arises in the theory of tidal oscillations in parabolic bays. It is also
the equation of the harmonic oscillator in wave mechanics. We can provide that $\xi_1$ and
$\xi_2$ shall be single-valued by never crossing the axis of the parabolas on one side of the
origin.

18·05. Elliptic cylinder coordinates. Take

$$x+iy = c\cosh(\xi+i\eta),\qquad x = c\cosh\xi\cos\eta,\qquad y = c\sinh\xi\sin\eta. \tag{1}$$

The curves of $\xi$ constant are the ellipses

$$\frac{x^2}{c^2\cosh^2\xi}+\frac{y^2}{c^2\sinh^2\xi} = 1, \tag{2}$$

and those of $\eta$ constant are the confocal hyperbolas

$$\frac{x^2}{c^2\cos^2\eta}-\frac{y^2}{c^2\sin^2\eta} = 1. \tag{3}$$

If we take $\xi$ always $\geq0$ we describe a complete confocal ellipse by increasing $\eta$ from
0 to $2\pi$. We have also

$$h^2 = |c\sinh(\xi+i\eta)|^2 = c^2(\sinh^2\xi\cos^2\eta+\cosh^2\xi\sin^2\eta) = \tfrac12c^2(\cosh2\xi-\cos2\eta), \tag{4}$$

and if again

$$\frac{\partial^2\phi}{\partial z^2} = -\mu^2\phi, \tag{5}$$

$$\frac{\partial^2\phi}{\partial\xi^2}+\frac{\partial^2\phi}{\partial\eta^2}+\tfrac12c^2(\cosh2\xi-\cos2\eta)(\kappa^2-\mu^2)\phi = 0. \tag{6}$$

The standard solutions can therefore be taken as

$$\phi = XYZ, \tag{7}$$

where $X$ and $Y$ satisfy

$$\frac{d^2X}{d\xi^2}-\{R-\tfrac12c^2(\kappa^2-\mu^2)\cosh2\xi\}X = 0, \tag{8}$$

$$\frac{d^2Y}{d\eta^2}+\{R-\tfrac12c^2(\kappa^2-\mu^2)\cos2\eta\}Y = 0. \tag{9}$$

We write

$$\tfrac12c^2(\kappa^2-\mu^2) = 16q. \tag{10}$$

The second of these is Mathieu's equation. $R$ is a constant, to be determined in such a way
that the solution has period $2\pi$. Evidently if $q = 0$, $R$ must be the square of an integer.
Since the coefficient of $Y$ is an even function, there will be one even and one odd solution
for any pair of values of $R$ and $q$. It has been shown by Ince and others that not more
than one of these can be periodic except for $q = 0$, and the datum that one is periodic
determines a discrete set of values of $R$. The even solutions are denoted by $\mathrm{ce}_n(\eta,q)$, the
odd ones by $\mathrm{se}_n(\eta,q)$. Changing $\xi$ to $i\eta$ in (8) reproduces (9).

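Since we always take $\xi$ positive the only way of comparing values on opposite sides of
the line joining the foci is to change $\eta$ to $2\pi-\eta$. $\partial\phi/\partial x$ and $\partial\phi/\partial y$ are to be continuous
across this line. But on $\xi = 0$

$$\frac{\partial\phi}{\partial\xi} = \frac{\partial\phi}{\partial x}\frac{\partial x}{\partial\xi}+\frac{\partial\phi}{\partial y}\frac{\partial y}{\partial\xi}
= \frac{\partial\phi}{\partial x}c\sinh\xi\cos\eta+\frac{\partial\phi}{\partial y}c\cosh\xi\sin\eta = c\sin\eta\,\frac{\partial\phi}{\partial y}, \tag{11}$$

$$\frac{\partial\phi}{\partial\eta} = -\frac{\partial\phi}{\partial x}c\cosh\xi\sin\eta+\frac{\partial\phi}{\partial y}c\sinh\xi\cos\eta = -c\sin\eta\,\frac{\partial\phi}{\partial x}. \tag{12}$$

Hence at $\xi = 0$, $\dfrac{1}{\sin\eta}Y\dfrac{dX}{d\xi}$ and $\dfrac{1}{\sin\eta}X\dfrac{dY}{d\eta}$ are unaltered on changing $\eta$ to $2\pi-\eta$. If $Y$ is
an even function of $\eta$, $Y/\sin\eta$ changes sign, and the conditions require that $dX/d\xi = 0$.
Therefore $X$ is an even function of $\xi$. Secondly, if $Y$ is $\mathrm{se}_n(\eta,q)$, an odd function, $dY/d\eta$
does not change sign and therefore $X = 0$. Hence $X$ is an odd function. The possible
types of solution (in real form) are therefore

$$XY = \mathrm{ce}_n(i\xi,q)\,\mathrm{ce}_n(\eta,q),\qquad XY = -i\,\mathrm{se}_n(i\xi,q)\,\mathrm{se}_n(\eta,q). \tag{13}$$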

Mathieu’s equation has mathematical interest because it is the simplest second-order 
equation with a periodic coefficient. The restriction that the solutions also must have
period $2\pi$ is required by the physical conditions in problems of membranes and tidal
oscillations. There are, however, many problems of vibration where the restoring force 
contains a periodic coefficient, and these can be treated by similar methods to those used 
for the solution of Mathieu’s equation.® 

18·06. Spherical and spheroidal coordinates. The special feature here is that
one coordinate is constant over the surface, respectively, of a sphere or an ellipsoid of
revolution. In either case there is an axis of symmetry (for the sphere, of course, an infinite
number) and one coordinate $\xi_3 = \lambda$ can be taken to be the azimuth about it and the other
two to be orthogonal coordinates in planes of $\lambda$ constant. If we also take cylindrical
coordinates $\varpi, \lambda, z$ we can simplify the analysis by making use of the fact that

$$\varpi^s(\cos s\lambda+i\sin s\lambda) = (x+iy)^s \tag{1}$$

gives a pair of solutions of Laplace's equation. In spherical coordinates, then, we know
at once that there is a family of solutions $r^s\sin^s\theta\,(\cos s\lambda,\sin s\lambda)$. Now if we denote one of
these solutions by $M$ we are led to look for other solutions containing it as a factor, say
$FM$, where $F$ is independent of $\lambda$. Then

$$\nabla^2(FM) = M\nabla^2F+2\frac{\partial F}{\partial x_i}\frac{\partial M}{\partial x_i}+F\nabla^2M, \tag{2}$$

and the last term is 0. Also the second term is unaltered by rotation of the axes; therefore

$$\frac{\partial F}{\partial x_i}\frac{\partial M}{\partial x_i} = \frac{\partial F}{\partial s_1}\frac{\partial M}{\partial s_1}+\frac{\partial F}{\partial s_2}\frac{\partial M}{\partial s_2}+\frac{\partial F}{\partial s_3}\frac{\partial M}{\partial s_3}. \tag{3}$$

But $ds_3 = \varpi\,d\lambda$, and $\partial F/\partial s_3 = 0$. Hence

$$\nabla^2(FM) = M\nabla^2F+2\left(\frac{\partial F}{\partial s_1}\frac{\partial M}{\partial s_1}+\frac{\partial F}{\partial s_2}\frac{\partial M}{\partial s_2}\right). \tag{4}$$

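18·061. Spherical polar coordinates.

$$ds_1 = dr,\qquad ds_2 = r\,d\theta,\qquad ds_3 = r\sin\theta\,d\lambda, \tag{5}$$

and

$$\nabla^2\phi = \frac{1}{r^2\sin\theta}\left\{\frac{\partial}{\partial r}\left(r^2\sin\theta\frac{\partial\phi}{\partial r}\right)+\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\phi}{\partial\theta}\right)+\frac{\partial}{\partial\lambda}\left(\frac{1}{\sin\theta}\frac{\partial\phi}{\partial\lambda}\right)\right\}
= \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\phi}{\partial r}\right)+\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\phi}{\partial\theta}\right)+\frac{1}{r^2\sin^2\theta}\frac{\partial^2\phi}{\partial\lambda^2}. \tag{6}$$

Since Laplace's equation is homogeneous in $r$ we try

$$\phi = FM,\qquad F = r^{n-s}\Theta, \tag{7}$$

where $\Theta$ is a function of $\theta$ only. Then

$$\nabla^2F = (n-s)(n-s+1)\,r^{n-s-2}\Theta+r^{n-s-2}\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right). \tag{8}$$

Also

$$\frac{\partial F}{\partial s_1}\frac{\partial M}{\partial s_1}+\frac{\partial F}{\partial s_2}\frac{\partial M}{\partial s_2} = r^{n-s-2}\left\{s(n-s)\Theta M+sM\cot\theta\frac{d\Theta}{d\theta}\right\}. \tag{9}$$

Hence in spherical polar coordinates

$$\nabla^2(FM) = Mr^{n-s-2}\left[\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right)+2s\cot\theta\frac{d\Theta}{d\theta}+(n-s)(n+s+1)\Theta\right]. \tag{10}$$

If $\nabla^2\phi = 0$ this gives a differential equation satisfied by $\Theta$; we put $\cos\theta = \mu$, and it becomes

$$\frac{d}{d\mu}\left\{(1-\mu^2)\frac{d\Theta}{d\mu}\right\}-2s\mu\frac{d\Theta}{d\mu}+(n-s)(n+s+1)\Theta = 0. \tag{11}$$

When $s = 0$ this is Legendre's equation.

We denote the solutions by $\Theta_1, \Theta_2$. The solutions of Laplace's equation in spherical
polar coordinates are therefore of the form

$$\varpi^s(\cos s\lambda,\sin s\lambda)\,r^{n-s}(\Theta_1,\Theta_2) = r^n\sin^s\theta\,(\Theta_1,\Theta_2)(\cos s\lambda,\sin s\lambda). \tag{12}$$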

We have not assumed $n$ positive, and we see that (11) is unaltered if we replace $n$ by
$-n-1$; hence there will be another set of solutions with $r^{-n-1}$ instead of $r^n$, the other
factors remaining the same. If we express $\Theta$ in a power series in $\mu$ there are two solutions,
one an even and the other an odd function of $\mu$. If $n$ and $s$ are integers and $n-s$ is even,
we shall see that the even series terminates; if $n-s$ is odd the odd series terminates. For
the terminating series, with an appropriate constant factor, we write

$$\sin^s\theta\,.\,\Theta_1 = p_n^s(\mu), \tag{13}$$

and for the other

$$\sin^s\theta\,.\,\Theta_2 = q_n^s(\mu). \tag{14}$$

There is an exceptional case when $s = 0$; for if

$$\frac{d^2\Lambda}{d\lambda^2} = 0$$

the general solution is

$$\Lambda = A+B\lambda. \tag{15}$$

But if $\phi$ is to return to its original value when $\lambda$ is increased by $2\pi$, $B$ must be 0. Hence
for $s = 0$ the $\lambda$ factor is a constant. (11) can be written

$$(1-\mu^2)\Theta''-2(s+1)\mu\Theta'+(n-s)(n+s+1)\Theta = 0. \tag{16}$$

Differentiate this:

$$(1-\mu^2)\Theta'''-2(s+2)\mu\Theta''+(n-s-1)(n+s+2)\Theta' = 0, \tag{17}$$

which is the same equation with $\Theta'$ for $\Theta$ and $s+1$ for $s$. This differentiation does not
assume $s$ positive. Hence if $G$ satisfies the equation

$$(1-\mu^2)G''+2n\mu G' = 0 \tag{18}$$

obtained by putting $s+1 = -n$ in (16),

$$\Theta = \frac{d^{n+s+1}G}{d\mu^{n+s+1}} \tag{19}$$

will satisfy (16). But (18) gives

$$G'\propto(\mu^2-1)^n. \tag{20}$$

Hence

$$\Theta\propto\frac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n, \tag{21}$$

and a family of solutions of Laplace's equation in spherical polar coordinates is

$$r^{n-s}(x+iy)^s\frac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n, \tag{22}$$

that is,

$$r^n\sin^s\theta\,(\cos s\lambda,\sin s\lambda)\,\frac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n. \tag{23}$$
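
The statement that (21) satisfies (16) can be confirmed by exact polynomial arithmetic. The following Python sketch, in which the helper names and the representation of a polynomial by its list of integer coefficients (lowest power first) are assumptions of the illustration, forms the left side of (16) for $\Theta = d^{n+s}(\mu^2-1)^n/d\mu^{n+s}$ and prints its coefficients.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p, times=1):
    """Derivative of a polynomial, applied `times` times."""
    for _ in range(times):
        p = [k * c for k, c in enumerate(p)][1:] or [0]
    return p

def poly_add(*polys):
    """Sum of polynomials of possibly different lengths."""
    out = [0] * max(len(p) for p in polys)
    for p in polys:
        for k, c in enumerate(p):
            out[k] += c
    return out

def residual(n, s):
    """Left side of (16) for Theta = d^(n+s)/dmu^(n+s) of (mu^2 - 1)^n."""
    g = [1]
    for _ in range(n):
        g = poly_mul(g, [-1, 0, 1])            # (mu^2 - 1)^n
    theta = poly_diff(g, n + s)
    d1, d2 = poly_diff(theta), poly_diff(theta, 2)
    return poly_add(
        poly_mul([1, 0, -1], d2),               # (1 - mu^2) Theta''
        poly_mul([0, -2 * (s + 1)], d1),        # -2 (s + 1) mu Theta'
        [(n - s) * (n + s + 1) * c for c in theta])

print(residual(4, 2))
```

Every coefficient printed should be zero, and the same holds for any integers $0\leq s\leq n$ that may be tried.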


We have obtained only one solution of (16). The nature of the second may be seen by taking
$s = n$; the solution that we have found will be constant. The other is

$$\Theta'\propto(\mu^2-1)^{-(n+1)}, \tag{24}$$

and its $(n-s+1)$th integral, with $s\leq n$, will be infinite of order $\sin^{-2s}\theta$ at $\mu = \pm1$ and
contain a logarithm. Hence this solution is inadmissible for a complete sphere. It has
other applications, however, and will be considered further under Legendre functions.

Postponing the choice of the constant factors for $s\neq0$, we denote the solutions by
$r^np_n^s(\mu)(\cos s\lambda,\sin s\lambda)$, $r^nq_n^s(\mu)(\cos s\lambda,\sin s\lambda)$. For $s = 0$ the constant is chosen so that
$p_n^0(1) = 1$.

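We now return to

$$\nabla^2\phi = -\kappa^2\phi, \tag{25}$$

and put

$$\phi = R\,S_n(\theta,\lambda), \tag{26}$$

where $r^nS_n(\theta,\lambda)$ is a solution of $\nabla^2\phi = 0$. Then

$$\frac{d}{dr}\left(r^2\frac{dR}{dr}\right)+\{\kappa^2r^2-n(n+1)\}R = 0. \tag{27}$$

Put

$$R = r^{-1/2}K. \tag{28}$$

Then

$$r^2\frac{d^2K}{dr^2}+r\frac{dK}{dr}+\{\kappa^2r^2-(n+\tfrac12)^2\}K = 0, \tag{29}$$

and

$$K = AJ_{n+1/2}(\kappa r)+BY_{n+1/2}(\kappa r). \tag{30}$$

The required solutions are therefore

$$\phi = r^{-1/2}\{J_{n+1/2}(\kappa r),\,Y_{n+1/2}(\kappa r)\}\{p_n^s(\cos\theta),\,q_n^s(\cos\theta)\}(\cos s\lambda,\sin s\lambda). \tag{31}$$

The Bessel functions of order half an odd integer are expressible in finite terms.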

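18·062. Oblate spheroidal coordinates. We replace the coordinates $\varpi, z$ by $\xi, \eta$
according to

$$\varpi = c\cosh\xi\cos\eta,\qquad z = c\sinh\xi\sin\eta, \tag{1}$$

and again seek for solutions of the form

$$\phi = FM = F(\xi,\eta)\,\varpi^s(\cos s\lambda,\sin s\lambda). \tag{2}$$

Then

$$ds_1 = h\,d\xi,\qquad ds_2 = h\,d\eta,\qquad ds_3 = \varpi\,d\lambda, \tag{3}$$

where

$$h^2 = c^2(\cosh^2\xi-\cos^2\eta). \tag{4}$$

Then

$$\nabla^2F = \frac{1}{h^2\varpi}\left\{\frac{\partial}{\partial\xi}\left(\varpi\frac{\partial F}{\partial\xi}\right)+\frac{\partial}{\partial\eta}\left(\varpi\frac{\partial F}{\partial\eta}\right)\right\} \tag{5}$$

$$= \frac{1}{h^2}\left\{\frac{\partial^2F}{\partial\xi^2}+\frac{\partial^2F}{\partial\eta^2}+\frac1\varpi\frac{\partial\varpi}{\partial\xi}\frac{\partial F}{\partial\xi}+\frac1\varpi\frac{\partial\varpi}{\partial\eta}\frac{\partial F}{\partial\eta}\right\} \tag{6}$$

$$= \frac{1}{h^2}\left\{\frac{\partial^2F}{\partial\xi^2}+\frac{\partial^2F}{\partial\eta^2}+\tanh\xi\frac{\partial F}{\partial\xi}-\tan\eta\frac{\partial F}{\partial\eta}\right\}, \tag{7}$$

$$\frac{\partial F}{\partial s_1}\frac{\partial M}{\partial s_1}+\frac{\partial F}{\partial s_2}\frac{\partial M}{\partial s_2} = \frac{sM}{h^2}\left(\tanh\xi\frac{\partial F}{\partial\xi}-\tan\eta\frac{\partial F}{\partial\eta}\right), \tag{8}$$

$$\nabla^2(FM) = \frac{M}{h^2}\left\{\frac{\partial^2F}{\partial\xi^2}+\frac{\partial^2F}{\partial\eta^2}+(2s+1)\tanh\xi\frac{\partial F}{\partial\xi}-(2s+1)\tan\eta\frac{\partial F}{\partial\eta}\right\}. \tag{9}$$

And if

$$F = X(\xi)\,Y(\eta),\qquad\nabla^2\phi = -\kappa^2\phi, \tag{10}$$

$$\frac1X\left\{\frac{d^2X}{d\xi^2}+(2s+1)\tanh\xi\frac{dX}{d\xi}\right\}+\frac1Y\left\{\frac{d^2Y}{d\eta^2}-(2s+1)\tan\eta\frac{dY}{d\eta}\right\} = -\kappa^2c^2(\cosh^2\xi-\cos^2\eta), \tag{11}$$

whence

$$\frac{d^2X}{d\xi^2}+(2s+1)\tanh\xi\frac{dX}{d\xi}+(\kappa^2c^2\cosh^2\xi-R)X = 0, \tag{12}$$

$$\frac{d^2Y}{d\eta^2}-(2s+1)\tan\eta\frac{dY}{d\eta}-(\kappa^2c^2\cos^2\eta-R)Y = 0, \tag{13}$$

where $R$ is constant.

In the second of these put $\sin\eta = \mu$; we get

$$(1-\mu^2)\frac{d^2Y}{d\mu^2}-2(s+1)\mu\frac{dY}{d\mu}+\{R-\kappa^2c^2(1-\mu^2)\}Y = 0, \tag{14}$$

which, apart from the term in $\kappa^2$, is the equation satisfied by the function $\Theta$ for spherical
polar coordinates, with $R = (n-s)(n+s+1)$. In the first, similarly, put $i\sinh\xi = \nu$; then

$$(1-\nu^2)\frac{d^2X}{d\nu^2}-2(s+1)\nu\frac{dX}{d\nu}+\{R-\kappa^2c^2(1-\nu^2)\}X = 0, \tag{15}$$

which is the same equation. Hence for $\kappa^2 = 0$

$$X\cosh^s\xi = Ap_n^s(i\sinh\xi)+Bq_n^s(i\sinh\xi), \tag{16}$$

$$Y\cos^s\eta = Cp_n^s(\sin\eta)+Dq_n^s(\sin\eta). \tag{17}$$

These give the solutions of Laplace's equation. The presence of the terms in $\kappa^2$
considerably increases the complexity of the solutions of the other equations.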

18·063. Prolate spheroidal coordinates. Take

$$\varpi = c\sinh\xi\sin\eta,\qquad z = c\cosh\xi\cos\eta, \tag{1}$$

and proceed as before. We get

$$\frac{d^2X}{d\xi^2}+(2s+1)\coth\xi\frac{dX}{d\xi}+(\kappa^2c^2\sinh^2\xi-R)X = 0, \tag{2}$$

$$\frac{d^2Y}{d\eta^2}+(2s+1)\cot\eta\frac{dY}{d\eta}+(\kappa^2c^2\sin^2\eta+R)Y = 0. \tag{3}$$

In the second of these put $\cos\eta = \mu$; then

$$(1-\mu^2)\frac{d^2Y}{d\mu^2}-2(s+1)\mu\frac{dY}{d\mu}+\{\kappa^2c^2(1-\mu^2)+R\}Y = 0, \tag{4}$$

and in the first put $\cosh\xi = \nu$; then

$$(\nu^2-1)\frac{d^2X}{d\nu^2}+2(s+1)\nu\frac{dX}{d\nu}+\{\kappa^2c^2(\nu^2-1)-R\}X = 0. \tag{5}$$

In this case, if $r_1$ and $r_2$ are the distances of a point from the foci $(0,0,\pm c)$,

$$\nu = \frac{r_1+r_2}{2c},\qquad\mu = \frac{r_1-r_2}{2c}. \tag{6}$$

With R = (n — s) (n + s+ 1 ), k = 0 , (7) 

X sinh® £ = Ap* n (cosh £) + Bg£(cosh £), ( 8 ) 

Fsin 8 ^ = Op% (cos 17 )(cos 9 /). (9) 

Both here and in 18*062 the (f n solution is inadmissible within an ellipsoid if £ = 0 is a 
possible value. In both cases h 2 — 0 where £ = rj = 0 , and it is found on examination 
that the gradient of p 8 n <f n would tend to infinity for these values.* Hence within an 
oblate ellipsoid of revolution the solutions are p 8 n (i sinh £) p®(sin rj ), and within a prolate 
one they are p® (cosh £) p® (cos rj ). But in problems relating to the outside of an ellip¬ 
soid £ may be indefinitely large. If the solution is not to tend to infinity at a large 
distance p® is inadmissible, but q* n becomes admissible because £ cannot be 0 in the 
region considered and <f n {v) can be defined, except for — 1 < v < 1 , so that it tends to 0 
for large |v|. The solutions outside will be <f n (i sinh £) p® (sin rj ) and g^(cosh£)p®(cos 17 ) 
respectively. 


18·07. The equation is also separable in general ellipsoidal coordinates, and leads to Lamé functions, which are an extension of Mathieu functions.†

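18·08. Orthogonality relations. If $\phi$ and $\phi'$ are any two functions with continuous second derivatives in a region,

$$\iiint(\phi\nabla^2\phi' - \phi'\nabla^2\phi)\,d\tau = \iint\left(\phi\frac{\partial\phi'}{\partial n} - \phi'\frac{\partial\phi}{\partial n}\right)dS, \tag{1}$$

the first integral being through the region and the second over its boundary, $dn$ being in the direction of the outward normal. If $\phi$ and $\phi'$ satisfy Laplace’s equation in the region,

$$\iint\phi\frac{\partial\phi'}{\partial n}\,dS = \iint\phi'\frac{\partial\phi}{\partial n}\,dS. \tag{2}$$

Take the surface $S$ to be a sphere, and take

$$\phi = r^mS_m(\theta,\lambda), \qquad \phi' = r^nS_n(\theta,\lambda). \tag{3}$$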


* For details see Hobson, Spherical and Ellipsoidal Harmonics, 1931, 422.
† Webster, Partial Differential Equations of Mathematical Physics, pp. 331-42.


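Then the equation reduces to

$$m\iint r^{m+n-1}S_mS_n\,dS = n\iint r^{m+n-1}S_mS_n\,dS, \tag{4}$$

and $r$ is constant over the boundary. Hence if $m \ne n$,

$$\iint S_mS_n\sin\theta\,d\theta\,d\lambda = 0. \tag{5}$$

We express this verbally by saying that any two surface harmonics of different degrees are orthogonal. It is the analogue for a sphere of $\int_0^{2\pi}\cos m\theta\cos n\theta\,d\theta = 0$ and its companion relations when $m \ne n$.

If $\phi$ and $\phi'$, instead of satisfying Laplace’s equation, satisfy

$$\nabla^2\phi = -\kappa^2\phi, \qquad \nabla^2\phi' = -\kappa'^2\phi', \tag{6}$$

and both $\phi$ and $\phi'$ satisfy a boundary condition on $S$ of the form

$$a\phi + b\frac{\partial\phi}{\partial n} = 0, \tag{7}$$

where $a$ and $b$ are constants, the same for both solutions, it is clear that if either $a$ or $b$ is 0 both $\iint\phi\,\dfrac{\partial\phi'}{\partial n}\,dS$ and $\iint\phi'\,\dfrac{\partial\phi}{\partial n}\,dS$ are 0. If neither of $a$ and $b$ is zero

$$\iint\left(\phi\frac{\partial\phi'}{\partial n} - \phi'\frac{\partial\phi}{\partial n}\right)dS = -\frac{a}{b}\iint(\phi\phi' - \phi'\phi)\,dS = 0. \tag{8}$$

Hence

$$\iiint(\phi\nabla^2\phi' - \phi'\nabla^2\phi)\,d\tau = 0, \tag{9}$$

that is,

$$(\kappa^2 - \kappa'^2)\iiint\phi\phi'\,d\tau = 0. \tag{10}$$

Hence either $\kappa^2 = \kappa'^2$ or

$$\iiint\phi\phi'\,d\tau = 0. \tag{11}$$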

This gives a further set of orthogonality relations. Thus for a circular cylinder typical solutions are $J_m(\kappa\varpi)\cos m\lambda$, $J_n(\kappa'\varpi)\cos n\lambda$, where $\kappa$ and $\kappa'$ may be chosen so that on the boundary $\varpi = a$

$$J_m(\kappa a) = 0,\quad J_n(\kappa'a) = 0, \qquad\text{or}\qquad J'_m(\kappa a) = 0,\quad J'_n(\kappa'a) = 0. \tag{12}$$

Then if $\kappa \ne \kappa'$,

$$\int_0^a\!\!\int_0^{2\pi} J_m(\kappa\varpi)\,J_n(\kappa'\varpi)\cos m\lambda\cos n\lambda\;\varpi\,d\varpi\,d\lambda = 0. \tag{13}$$

This is satisfied if $m \ne n$; but if $m = n$ we must have

$$\int_0^a \varpi\,J_m(\kappa\varpi)\,J_m(\kappa'\varpi)\,d\varpi = 0, \tag{14}$$

if $\kappa$ and $\kappa'$ are two different roots of $J_m(\kappa a) = 0$, $J'_m(\kappa a) = 0$, or of any equation of the form $\alpha J_m(\kappa a) + \beta\kappa J'_m(\kappa a) = 0$.

Similarly, by applying the argument to a sphere, for which the typical solution valid in the interior is $r^{-1/2}J_{n+1/2}(\kappa r)\,p_n^s(\cos\theta)\cos s\lambda$, we get

$$\int_0^a r\,J_{m+1/2}(\kappa r)\,J_{m+1/2}(\kappa'r)\,dr = 0, \tag{15}$$

where $\kappa$ and $\kappa'$ are different roots of $J_{m+1/2}(\kappa a) = 0$, $J'_{m+1/2}(\kappa a) = 0$, or of

$$\alpha\,\frac{J_{m+1/2}(\kappa a)}{a^{1/2}} + \beta\,\frac{d}{da}\left\{\frac{J_{m+1/2}(\kappa a)}{a^{1/2}}\right\} = 0.$$

The orthogonality relations determine at once the coefficients in an expansion in terms of the characteristic solutions, provided such an expansion exists. For if the solutions are $\phi_0, \phi_1, \dots$ (in general a three-dimensional set) and the function $f(x, y, z)$ is assumed to be

$$f(x,y,z) = \sum_m a_m\phi_m, \tag{16}$$

then on account of the orthogonality relations

$$\iiint f\phi_n\,d\tau = \iiint\Big(\sum_m a_m\phi_m\Big)\phi_n\,d\tau = a_n\iiint\phi_n^2\,d\tau, \tag{17}$$

whence $a_n$ is determined. The proof of the existence of an expansion of the form (16), however, is long and rather difficult. Two methods are used, the Sturm-Liouville method based on direct study of the differential equation,* and the method of Green’s functions, which uses the theory of integral equations.† It may be said that the conditions required are similar to those for Fourier’s series theorem. The integral equations method is very beautiful, but unfortunately too long for this work.
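The rule (17) for the coefficients can be illustrated in the simplest one-dimensional analogue. The sketch below is an assumption-laden illustration rather than anything in the text: it takes the characteristic functions to be $\sin(n\pi x/l)$ on $0 \le x \le l$ (which vanish at both ends), uses a hypothetical test function $f(x) = x(1-x)$, and evaluates the integrals by a midpoint rule with illustrative helper names.

```python
import math

def integrate(g, a, b, n=2000):
    """Midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def coefficient(f, phi_n, a, b):
    """a_n = (integral of f*phi_n) / (integral of phi_n^2), the 1-D form of (17)."""
    num = integrate(lambda x: f(x) * phi_n(x), a, b)
    den = integrate(lambda x: phi_n(x) ** 2, a, b)
    return num / den

def sine_mode(n, l=1.0):
    """Characteristic function sin(n pi x / l), assumed here for illustration."""
    return lambda x: math.sin(n * math.pi * x / l)

f = lambda x: x * (1.0 - x)          # hypothetical function to be expanded
coeffs = [coefficient(f, sine_mode(n), 0.0, 1.0) for n in range(1, 6)]
```

For this choice the integrals can be done by hand, giving $a_n = 8/(n\pi)^3$ for odd $n$ and $a_n = 0$ for even $n$, so the numerical coefficients can be compared with a closed form.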


18·09. Potential at external points: MacCullagh’s formula. Let $O$ be an origin taken at a point of a distribution, $P\,(x_i = rl_i)$ an external point, $Q\,(x'_i = r'l'_i)$ an internal point. Put $PQ = R$. Then the potential at $P$ due to a distribution of density or electric charge is

$$\phi = \gamma\iiint\frac{\rho\,d\tau}{R} + \gamma\iint\frac{\sigma\,dS}{R}. \tag{1}$$

Now when $r' < r$, if $\theta$ is the angle $POQ$,

$$\frac{1}{R} = \frac{1}{\sqrt{(r^2 - 2rr'\cos\theta + r'^2)}} = \frac{1}{r} + \frac{r'\cos\theta}{r^2} + \frac{r'^2(3\cos^2\theta - 1)}{2r^3} + \dots \tag{2}$$

on expansion. We need take only the volume integral as the modifications for the surface integral are obvious. Since

$$\cos\theta = l_il'_i, \tag{3}$$

$$\phi = \frac{\gamma}{r}\iiint\rho\,d\tau + \frac{\gamma}{r^2}\iiint\rho r'l_il'_i\,d\tau + \frac{\gamma}{r^3}\iiint\rho r'^2\,\frac{3l_il_kl'_il'_k - 1}{2}\,d\tau + \dots \tag{4}$$

Now

$$\iiint\rho\,d\tau = M, \tag{5}$$

the total mass or charge. The coefficient of $\gamma l_i/r^2$ is

$$\iiint\rho x'_i\,d\tau = M\bar{x}_i, \tag{6}$$

which we can make zero by taking the origin at the centre of mass or of charge, provided that $M$ is not zero. (It does not appear to be noticed as a rule that in electrostatics a charged body has a charge centre in complete analogy with the centre of mass.) The coefficient of $\gamma r^{-3}$ is

$$\iiint\tfrac12\rho\,(3l_il_kx'_ix'_k - r'^2)\,d\tau = \tfrac12 l_il_k\iiint\rho\,(3x'_ix'_k - r'^2\delta_{ik})\,d\tau. \tag{7}$$


* Ince, Ordinary Differential Equations.

† Erhard Schmidt, Math. Ann. 63, 1907, 433-76; A. Kneser, ibid. pp. 477-624; F. Smithies,
Proc. Lond. Math. Soc. 43, 1937, 266-79.

Now if $I_{ik}$ is the inertia tensor at $O$

$$I_{ik} = \iiint\rho\,(r'^2\delta_{ik} - x'_ix'_k)\,d\tau, \tag{8}$$

and (7) may be written $-\tfrac32 l_il_kI_{ik} + \iiint\rho r'^2\,d\tau$.

But $I_{ik}l_il_k$ is the moment of inertia about $OP$. If we denote this by $I$ and the principal moments of inertia at $O$ by $A$, $B$, $C$

$$\phi = \frac{\gamma M}{r} + \frac{\gamma}{2r^3}(A + B + C - 3I) + O\!\left(\frac{1}{r^4}\right). \tag{9}$$

This is known as MacCullagh’s formula.

With an obvious extension the work function due to two gravitating bodies whose centres of mass are $r$ apart is

$$\frac{\gamma MM'}{r} + \frac{\gamma M'}{2r^3}(A + B + C - 3I) + \frac{\gamma M}{2r^3}(A' + B' + C' - 3I') + O\!\left(\frac{1}{r^4}\right). \tag{10}$$

MacCullagh’s formula is also correct right down to the surface of a gravitating solid provided that this surface is an ellipsoid, the squares of whose departures from a sphere can be neglected. The justification in the latter case, however, is quite different from the one just given, and requires the theory of expansions in spherical harmonics over a sphere (24·06).

In the case $M = 0$ it is clear that the first term of (4) is 0 and that no change of origin can alter the values of the second, since the effect of any change of origin on the coefficient would be multiplied by the zero factor $M$. This case arises in magnetism and in some problems of dielectrics. In spherical polar coordinates the term in $1/r^2$ is

$$\frac{\gamma}{r^2}\iiint\rho r'\{\cos\theta\cos\theta' + \sin\theta\sin\theta'\cos(\phi - \phi')\}\,d\tau.$$

That in $1/r^3$ has the same form as before but the moments must of course be taken about the same point as was used as origin for the term in $1/r^2$.
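MacCullagh’s formula (9) can be tried on a small set of point masses, for which the exact potential is an elementary sum. The sketch below is illustrative only: it assumes $\gamma = 1$ by default, represents the body as a hypothetical list of `(mass, position)` pairs whose centre of mass is at the origin, and uses the identities $A + B + C = 2\sum m r'^2$ and $I = \sum m\{r'^2 - (x'_il_i)^2\}$ that follow from (8).

```python
import math

def maccullagh_potential(masses, point, gamma=1.0):
    """Formula (9): gamma*M/r + gamma*(A + B + C - 3I)/(2 r^3).

    `masses` is a list of (m, (x, y, z)); the centre of mass is assumed at the origin.
    """
    r = math.sqrt(sum(c * c for c in point))
    l = [c / r for c in point]                 # direction cosines l_i of OP
    M = sum(m for m, _ in masses)
    trace = 0.0                                # A + B + C = 2 * sum(m r'^2)
    I = 0.0                                    # moment of inertia about OP
    for m, q in masses:
        r2 = sum(c * c for c in q)
        along = sum(qc * lc for qc, lc in zip(q, l))
        trace += 2.0 * m * r2
        I += m * (r2 - along * along)
    return gamma * M / r + gamma * (trace - 3.0 * I) / (2.0 * r ** 3)

def exact_potential(masses, point, gamma=1.0):
    """Direct sum of gamma*m/R over the point masses."""
    return gamma * sum(m / math.dist(point, q) for m, q in masses)

# Hypothetical body: four masses, symmetric about the origin.
body = [(1.0, (1.0, 0.0, 0.0)), (1.0, (-1.0, 0.0, 0.0)),
        (2.0, (0.0, 0.5, 0.0)), (2.0, (0.0, -0.5, 0.0))]
P = (10.0, 0.0, 0.0)
error_with_formula = abs(maccullagh_potential(body, P) - exact_potential(body, P))
error_monopole_only = abs(6.0 / 10.0 - exact_potential(body, P))
```

At ten times the size of the body the second term of (9) removes most of the error left by $\gamma M/r$ alone; the residual comes from the neglected higher terms of the expansion.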


EXAMPLES 

1. If $f(\zeta)$ is an analytic function of $\zeta$ in a region $R$ including the origin and a segment of the real axis, and if $z$, $\varpi$, $\lambda$ are cylindrical polar coordinates, prove that the integral

$$\int_0^{2\pi} f(z + i\varpi\cos\alpha)\,d\alpha$$

is a potential function in the corresponding region of the variables $z$ and $\varpi$.

By taking $f(\zeta)$ to be $\tan^{-1}(a/\zeta)$, or otherwise, verify that the free distribution of electricity over the conducting circular disk $z = 0$, $\varpi \le a$ has a surface density proportional to $(a^2 - \varpi^2)^{-1/2}$; and show that the capacity of the disk is $2a/\pi$. (M.T. 1935.)

2. Six equal point charges $e$ are situated on the axes at equal distances $a$ from the origin, forming the vertices of a regular octahedron. It is required to expand the potential due to them in the neighbourhood of the origin. Prove:

(i) that the expression for the potential must be invariant for changes of sign and for interchanges of the coordinates $(x, y, z)$;

(ii) that therefore the lowest terms in the expansion can be written $A + Br^2 + Cr^4 + D(x^4 + y^4 + z^4)$, where $r^2 = x^2 + y^2 + z^2$;

(iii) that in order to satisfy Laplace’s equation, $B = 0$, $C = -\tfrac35 D$.

By calculating the potential at $(x, 0, 0)$, where $x < a$, show that $D$ is positive and find its value.

If instead of six equal charges there are eight, situated at the corners of a cube whose edges, each of length $2b$, are parallel to the axes, and whose centre is at the origin, obtain the corresponding expansion of the potential near the origin, showing that in this case $D$ is negative.

(M/c, Part II, 1938.)

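3. Show that the mutual potential energy of two small magnets of strengths $\boldsymbol\mu_1$, $\boldsymbol\mu_2$ whose centres are at the points $\mathbf r_1$, $\mathbf r_2$ is

$$\frac{\boldsymbol\mu_1\cdot\boldsymbol\mu_2}{|\mathbf r_1 - \mathbf r_2|^3} - \frac{3\{\boldsymbol\mu_1\cdot(\mathbf r_1 - \mathbf r_2)\}\{\boldsymbol\mu_2\cdot(\mathbf r_1 - \mathbf r_2)\}}{|\mathbf r_1 - \mathbf r_2|^5}.$$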

4. Obtain the solution of the equation of heat conduction satisfying the conditions

$V = 0$ $(x = 0,\ x = a)$,

$V = x^2(a^2 - x^2)$ $(0 \le x \le a,\ t = 0)$,

$V \to 0$ $(t \to \infty)$,

intheform V = 2a‘ £ [(-+ . (X. C . 1944.) 

n = l L \n d 7r 3 n b 7i b J \n 3 7T 3 n b n 5 /J a ' 

5. Obtain a solution of the equation 

d*z 8 2 z _ 

8x 2 dy 2 

in the form z = f(x) g(y), such that z = 0 when x = 0 and y = 0, and dzjdx = 0 when x = n. 


6. A rectangle has its sides of lengths $a$ and $b$ maintained at temperatures 0 and 1 respectively. Find the steady temperature at any point, and show that at the centre of the rectangle its value is

$$\frac{4}{\pi}\sum_{n=0}^{\infty}\frac{(-1)^n}{2n+1}\operatorname{sech}\left\{(2n+1)\frac{\pi a}{2b}\right\}. \qquad\text{(I.C. 1944.)}$$


7. A circular membrane of radius $a$ is fixed at the edge and under uniform tension $P$. Show that in a symmetrical normal displacement the potential energy is

$$\tfrac12 P\int_0^a\left(\frac{\partial z}{\partial r}\right)^2 2\pi r\,dr.$$

Assuming 2 = 0^1-^^!+/?^ 


and suitably adjusting ft, obtain an estimate of the longest period. 


(I.C. 1940.) 


8. A plane area of heat-conducting material is bounded by an ellipse and its major axis. The curved boundary is maintained at temperature $V_1$ and the straight boundary at $V_2$. Find the steady temperature at any point within the area. (I.C. 1940.)

9. A conducting solid is charged to potential $\phi_s$. Prove that there is a point of the solid such that if $r$ is the distance from this point the potential at a large distance is



where c is the electrostatic capacity of the solid. 

Two conductors of capacities $c_1$, $c_2$, a long distance apart, have charges $e_1$, $e_2$ and potentials $\phi_1$, $\phi_2$. Prove that the potential energy F is given by


2F = 


£ + e | 

C 1 c 2 


2Ci 6n 

+ - J - ? + 0 
r 





where r is the distance between the charge centres of the conductors. 


Chapter 19


WAVES IN ONE DIMENSION AND WAVES
WITH SPHERICAL SYMMETRY


19·01. Vibrating string: d’Alembert’s solution. In a large class of physical problems we meet with the differential equation

$$\frac{\partial^2y}{\partial t^2} = c^2\frac{\partial^2y}{\partial x^2}, \tag{1}$$

where $t$ is the time, $x$ the distance from a fixed point or a fixed plane, and $c$ is a known velocity. The general solution of this equation was given by d’Alembert. We take as new variables

$$u = x - ct, \qquad v = x + ct, \tag{2}$$

and then by transforming the differential equation we get

$$\frac{\partial^2y}{\partial u\,\partial v} = 0. \tag{3}$$


It follows that $\partial y/\partial v$ is independent of $u$, and therefore a function of $v$ only; and integrating again we see that $y$ must be of the form

$$y = f(u) + g(v) = f(x - ct) + g(x + ct). \tag{4}$$

Further, any functions $f$ and $g$ substituted in this equation will give a solution of (1) provided that they are twice differentiable.

Consider a uniform string under tension $P$, with mass $\rho$ per unit length, and suppose it displaced transversely so that the displacement $y$ and its gradient $\partial y/\partial x$ are small at all points. Then to the first order of small quantities the transverse component of the tension is $P\,\partial y/\partial x$ and is communicating transverse momentum to the part of the string to the left of $x$ at a rate $P\,\partial y/\partial x$. Hence

$$\frac{\partial}{\partial t}\int_0^x\rho\frac{\partial y}{\partial t}\,dx = \left[P\frac{\partial y}{\partial x}\right]_0^x, \tag{5}$$

and on differentiating with regard to $x$, assuming this permissible,

$$\rho\frac{\partial^2y}{\partial t^2} = P\frac{\partial^2y}{\partial x^2}, \tag{6}$$

which is of the form (1), with $c^2 = P/\rho$. But (5), which is physically the more fundamental equation, only assumes that $y$ is differentiable once with regard to both $x$ and $t$ and $\int\dot y\,dx$ differentiable with regard to $t$. Now substitute $y = f(x - ct)$ in (5), assuming that $f(x - ct)$ has an integrable derivative. We have

$$\frac{\partial}{\partial t}\int_0^x\rho\frac{\partial y}{\partial t}\,dx = \frac{\partial}{\partial t}\int_0^x\rho\,(-c)f'(x - ct)\,dx$$

$$= -\frac{\partial}{\partial t}\,\rho c\{f(x - ct) - f(-ct)\}$$

$$= \rho c^2\{f'(x - ct) - f'(-ct)\} = \left[P\frac{\partial y}{\partial x}\right]_0^x. \tag{7}$$

Hence (5) is satisfied by a once differentiable function $f(x - ct)$, and similarly by $g(x + ct)$; and (4) therefore satisfies the physical conditions without the need to suppose that $y$ is twice differentiable. The point is of some importance because (4) is habitually taken as a solution of wave problems where the second derivatives required in (1) do not exist. The differentiation with regard to $x$ needed to give (1) or (6) is merely a mathematical device to present the problem in a tractable form, which suggests a solution but cannot prove that it is right in all the cases where we want to use it. The proof that the suggested solution satisfies the mechanical conditions when only the first derivatives exist requires the further argument leading to (7).

It may be noticed that the restriction that the first derivatives must exist at all points is still a little unnecessarily severe. The derivatives may have finite discontinuities for some values of $x$ or $t$. Inspecting the proof of (7) again we see that it still holds provided that neither $x - ct$ nor $-ct$ is a point of discontinuity of the derivative of $f(x - ct)$. Even at points where the derivative is discontinuous, the value of $f(x - ct)$ can be filled in from the fact that $f(x - ct)$ is continuous. Hence there is a solution of the form (4) if, for instance, the string is displaced into a number of straight segments. In this case the second derivatives in (1) are zero except at the points where the slope changes discontinuously, where the second derivative does not exist, so that (4) becomes meaningless; but the mechanical problem still has a definite solution.

We require forms of the two functions $f(x - ct)$ and $g(x + ct)$ that will correspond to assigned values of $y$ and $\dot y$ when $t = 0$. Now if initially $y = \phi(x)$ it is clear that

$$\tfrac12\{\phi(x - ct) + \phi(x + ct)\}$$

reduces to $\phi(x)$ at $t = 0$ for all values of $x$, and its first derivative with regard to $t$ is zero. Also $\psi(x + ct) - \psi(x - ct) = 0$ at $t = 0$ and

$$\frac12\frac{\partial}{\partial t}\{\psi(x + ct) - \psi(x - ct)\} = \tfrac12c\{\psi'(x + ct) + \psi'(x - ct)\} \to c\,\psi'(x).$$

Hence if at $t = 0$, $\dot y = \chi(x)$, and we take

$$\psi'(x) = \frac1c\chi(x), \qquad \psi(x) = \frac1c\int\chi(x)\,dx,$$

$\tfrac12\{\psi(x + ct) - \psi(x - ct)\}$ will contribute nothing to $y$ at $t = 0$ and will give the correct value of $\dot y$; and

$$\tfrac12\{\psi(x + ct) - \psi(x - ct)\} = \frac{1}{2c}\int_{x-ct}^{x+ct}\chi(x)\,dx.$$

The solution that makes $y = \phi(x)$, $\dot y = \chi(x)$ at $t = 0$ is therefore

$$y = \tfrac12\{\phi(x - ct) + \phi(x + ct)\} + \frac{1}{2c}\int_{x-ct}^{x+ct}\chi(x)\,dx. \tag{8}$$

This is d’Alembert’s solution. It is the most general solution possible, but as it stands it is applicable for indefinitely large $t$ only to a string of infinite length in both directions and subject to no external forces. For if the string extends from $x = 0$ to $x = l$ our data can be only the values of $y$ and $\dot y$ for $t = 0$ and $x = 0$ to $l$, and for any positive $t$ however small, there will be values of $x$ between 0 and $l$ such that $x - ct$ is negative and values between 0 and $l$ such that $x + ct > l$. But for a string of finite length there will in general be forces at the ends, which will, for instance, keep the ends fixed. The situation is, therefore, that to obtain the position of the string for positive $t$ we need values of the functions $\phi$ and $\chi$ outside the range originally given: but we have instead information for all time about the displacements at the ends. To make the problem precise we take the case where the ends are fixed; then if we can choose $\phi$ and $\chi$ outside the given range so that $y$ as given by (8) vanishes at $x = 0$ and $l$ for all $t$, we satisfy the equation of motion within the string and also the end conditions, and therefore have a solution of the problem. Mathematically stated, the two functions $\phi$ and $\chi$ are arbitrary outside the range $0 \le x \le l$, and we choose them to make the solution satisfy the conditions at the ends. Physically, we imagine the finite string replaced by an infinite one, and choose the initial displacements and velocities of the latter so that the ends will not move; and we expect that within the range $0 \le x \le l$ the solution will be the same as that for a finite string constrained by forces at the ends that prevent the ends from moving. In the one case the forces come from reactions with the supports, in the other from the tensions in the outlying parts of the string. The conditions on $\phi$ and $\chi$ are seen from (8) to be that both must be antisymmetrical about both $x = 0$ and $x = l$; that is,

$$\left.\begin{aligned}\phi(-x) &= -\phi(x), & \phi(2l - x) &= -\phi(x),\\ \chi(-x) &= -\chi(x), & \chi(2l - x) &= -\chi(x).\end{aligned}\right\} \tag{9}$$

For fixed ends these equations give at once the complete solution. When the ends are not fixed, however, as when one end carries a massive particle but is not fixed, the extrapolations are much less obvious and the easiest method of solution is the operational one.
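The recipe of (8) with the extension (9) can be sketched in a few lines. This is an illustrative sketch, not the authors' procedure in code: the helper names, the midpoint rule used for the integral of $\chi$, and the number of integration steps are all assumptions, and the initial data are supposed to vanish at $x = 0$ and $x = l$.

```python
import math

def odd_periodic(func, l):
    """Extend func from [0, l] to all x, antisymmetric about x = 0 and x = l, as in (9)."""
    def extended(x):
        x = math.fmod(x, 2.0 * l)
        if x < 0.0:
            x += 2.0 * l                     # now 0 <= x <= 2l
        return func(x) if x <= l else -func(2.0 * l - x)
    return extended

def dalembert_fixed(phi, chi, l, c, x, t, steps=2000):
    """Evaluate (8) for a string with fixed ends, using the extended phi and chi."""
    phi_e, chi_e = odd_periodic(phi, l), odd_periodic(chi, l)
    a, b = x - c * t, x + c * t
    h = (b - a) / steps
    integral = h * sum(chi_e(a + (i + 0.5) * h) for i in range(steps))
    return 0.5 * (phi_e(a) + phi_e(b)) + integral / (2.0 * c)

# Hypothetical data: string of unit length drawn aside at x = 0.3 and released from rest.
l, b, eta = 1.0, 0.3, 0.1
plucked = lambda x: eta * x / b if x <= b else eta * (l - x) / (l - b)
at_rest = lambda x: 0.0
y_mid = dalembert_fixed(plucked, at_rest, l, 1.0, 0.5, 0.25)
```

Because the extended functions have period $2l$, the displacement computed in this way repeats after a time $2l/c$, and it vanishes at both ends for every $t$.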

If we take the term $f(x - ct)$ by itself, we see that it is unaltered if we increase $t$ by $\tau$ and $x$ by $c\tau$. Hence the part of the displacement represented by this term can be regarded as travelling with constant velocity $c$ in the direction of increasing $x$; the term can therefore be considered as representing a progressive wave. Similarly the term $g(x + ct)$ can be regarded as representing a progressive wave travelling with velocity $c$ in the direction of decreasing $x$.


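19·02. Operational solution for string with fixed ends. Let us consider in detail the case of a string of length $l$, originally drawn aside a distance $\eta$ at the point $x = b$, so that initially it lies in two straight pieces, and then released. Then if $y_0$ is the initial displacement,

$$\left.\begin{aligned}y_0 &= \eta x/b & (0 \le x \le b),\\ y_0 &= \eta(l - x)/(l - b) & (b \le x \le l),\end{aligned}\right\} \tag{1}$$

and the initial velocity is zero. Hence the subsidiary equation is

$$p^2y - c^2\frac{\partial^2y}{\partial x^2} = p^2y_0, \tag{2}$$

and we solve as if $p$ was a constant, subject to the condition that $y$ must always vanish at $x = 0$ and $x = l$. Operations are supposed performed on $H(t)$ unless the contrary is stated. The solution is

$$\left.\begin{aligned}y &= \eta\frac{x}{b} + A\sinh\frac{px}{c}\sinh\frac{p}{c}(l - b) & (0 \le x \le b),\\ y &= \eta\frac{l - x}{l - b} + A\sinh\frac{pb}{c}\sinh\frac{p}{c}(l - x) & (b \le x \le l).\end{aligned}\right\} \tag{3}$$

The constant $A$ must be the same in both these expressions in order that $y$ may be continuous at $x = b$. Also a discontinuity in $\partial y/\partial x$ at this point would imply an infinite acceleration, which cannot persist. There may be discontinuities of $\partial y/\partial x$ at special instants, but these will give impulsive changes in velocity. We therefore choose $A$ so that $\partial y/\partial x$ will in general be continuous at $x = b$; this gives

$$\eta\left(\frac1b + \frac{1}{l - b}\right) + \frac{Ap}{c}\left\{\cosh\frac{pb}{c}\sinh\frac{p}{c}(l - b) + \sinh\frac{pb}{c}\cosh\frac{p}{c}(l - b)\right\} = 0, \tag{4}$$

whence

$$A = -\frac{\eta l}{b(l - b)}\frac{c}{p}\operatorname{cosech}\frac{pl}{c}. \tag{5}$$

Then

$$\left.\begin{aligned}y &= \eta\frac{x}{b} - \frac{\eta l}{b(l - b)}\frac{c}{p}\frac{\sinh(px/c)\sinh\{p(l - b)/c\}}{\sinh(pl/c)} & (0 \le x \le b),\\ y &= \eta\frac{l - x}{l - b} - \frac{\eta l}{b(l - b)}\frac{c}{p}\frac{\sinh(pb/c)\sinh\{p(l - x)/c\}}{\sinh(pl/c)} & (b \le x \le l).\end{aligned}\right\} \tag{6}$$

This can be interpreted at once by the partial fraction rule. Taking the first expression, we see that if we replace $p$ by a constant $z$ and then make $z$ tend to 0 the second term just cancels the first; hence there is no term in the interpretation independent of $t$. There are poles at $zl/c = n\pi i$, where $n$ is any integer, positive or negative, but not zero, and the partial fraction rule gives for $0 \le x \le b$,

$$y = -\frac{\eta l}{b(l - b)}\sum_{n \ne 0}\frac{i\sin\dfrac{n\pi x}{l}\; i\sin\dfrac{n\pi(l - b)}{l}}{\dfrac{n\pi i}{l}\,\dfrac{n\pi ic}{l}\,\dfrac{l}{c}\cosh n\pi i}\;e^{n\pi ict/l}$$

$$= \frac{2\eta l^2}{b(l - b)}\sum_{n=1}^{\infty}\frac{1}{n^2\pi^2}\sin\frac{n\pi x}{l}\sin\frac{n\pi b}{l}\cos\frac{n\pi ct}{l}. \tag{7}$$

The solution for $b \le x \le l$ leads to exactly the same expression.
Every term of this satisfies the differential equation

$$\frac{\partial^2y}{\partial t^2} = c^2\frac{\partial^2y}{\partial x^2}. \tag{8}$$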


The separate terms can therefore be regarded as each representing an oscillation in period $2l/nc$, the displacements for all values of $x$ varying proportionately. As for systems with a finite number of degrees of freedom, the partial fraction rule leads to the analysis of the motion into normal modes. The motion corresponding to a given harmonic factor in the time is called a standing wave. We see that any standing wave can be replaced by a pair of progressive waves; for

$$2\cos\frac{n\pi ct}{l}\sin\frac{n\pi x}{l} = \sin\frac{n\pi}{l}(x + ct) + \sin\frac{n\pi}{l}(x - ct). \tag{9}$$

Similarly, any progressive wave can be replaced by a pair of standing waves with phases $\tfrac12\pi$ apart; thus

$$\sin\frac{n\pi}{l}(x + ct) = \sin\frac{n\pi x}{l}\cos\frac{n\pi ct}{l} + \cos\frac{n\pi x}{l}\sin\frac{n\pi ct}{l}. \tag{10}$$

The natural way to try to prove that the solution found satisfies the required conditions would be to differentiate term by term and substitute in (8). But in this case that method does not work, because if we differentiate twice we get a series whose terms oscillate finitely as $n$ increases. But if we differentiate once we get a convergent series which can be substituted in 19·01 (5), and the verification that 19·01 (5) is satisfied presents no particular difficulty. Alternatively, we can break up each term as in (9), and notice that the series is in this way converted into the sum of a function of $x - ct$ and one of $x + ct$, each separately being once differentiable, and therefore it satisfies the equation of motion. It also satisfies the end conditions. For each term vanishes at $x = 0$ and $x = l$, and the series is uniformly convergent because the modulus of the general term is $< n^{-2}$, and therefore the sum tends to 0 as $x \to 0$ or $l$.
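The normal-mode series 19·02 (7) is easy to sum numerically, which gives a direct check that it reproduces the two straight pieces of the initial displacement. The sketch below is illustrative: the function names, the number of terms, and the sample values of $l$, $b$, $\eta$ and $c$ are assumptions made for the example.

```python
import math

def plucked_series(x, t, l, b, eta, c, terms=2000):
    """Partial sum of (7) for a string drawn aside a distance eta at x = b."""
    total = 0.0
    for n in range(1, terms + 1):
        k = n * math.pi / l
        total += math.sin(k * x) * math.sin(k * b) * math.cos(k * c * t) / n ** 2
    return 2.0 * eta * l * l / (b * (l - b) * math.pi ** 2) * total

def initial_shape(x, l, b, eta):
    """The initial displacement y_0 of 19.02 (1)."""
    return eta * x / b if x <= b else eta * (l - x) / (l - b)

# Hypothetical sample values.
l, b, eta, c = 1.0, 0.3, 0.1, 1.0
mismatch_at_start = abs(plucked_series(0.2, 0.0, l, b, eta, c) - initial_shape(0.2, l, b, eta))
```

Since the terms fall off only like $n^{-2}$, the truncation error after $N$ terms is of order $1/N$ times the leading coefficient, so a few thousand terms are needed for four-figure agreement.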

The series (7) converges too slowly to be of much use for actual calculation of the displacement. Another method of evaluation is as follows. We start with Bromwich’s integral:

$$-\frac{c}{p}\frac{\sinh(px/c)\sinh\{p(l - b)/c\}}{\sinh(pl/c)} = -\frac{c}{2\pi i}\int_L\frac{\sinh(zx/c)\sinh\{z(l - b)/c\}}{z^2\sinh(zl/c)}\,e^{zt}\,dz, \tag{11}$$

where $\Re(z) = k > 0$ on the line $L$. But then $|e^{-2zl/c}| < 1$ on $L$ and at all points to the right of it, and we can expand $\operatorname{cosech}(zl/c)$ in a convergent geometric series $2e^{-zl/c}\sum_{n=0}^{\infty}e^{-2nzl/c}$. The order of integration and summation can be inverted, and we have

$$-\frac{c}{4\pi i}\sum_{n=0}^{\infty}\int_L e^{z(x-b)/c}\,(1 - e^{-2zx/c})(1 - e^{-2z(l-b)/c})\,e^{-2nlz/c}\,e^{zt}\,\frac{dz}{z^2}$$

$$= -\tfrac12c\sum_{n=0}^{\infty}\frac1p\,e^{p(x-b)/c}\,(1 - e^{-2px/c})(1 - e^{-2p(l-b)/c})\,e^{-2pnl/c}\,H(t), \tag{12}$$

which is the result of expanding (11) directly as if $p$ was a constant with a positive real part. But

$$\left.\begin{aligned}\frac{c}{p}H(t) &= 0 & (t < 0),\\ &= ct & (t > 0),\end{aligned}\right\} \tag{13}$$

and

$$\left.\begin{aligned}e^{-ph/c}\,\frac{c}{p}H(t) &= 0 & (ct < h),\\ &= ct - h & (ct > h).\end{aligned}\right\} \tag{14}$$

The first term of $y$ in (3) is $\eta x/b$ for $x < b$. All terms of (12) are zero until $ct = b - x$, so that

$$y = \eta x/b \qquad (0 < ct < b - x). \tag{15}$$

When $ct = b - x$ the first term in (12) begins to differ from 0; it is

$$-\tfrac12e^{-p(b-x)/c}\,\frac{c}{p}H(t) = -\tfrac12(ct - b + x), \tag{16}$$

and

$$y = \frac{\eta x}{b} - \frac{\eta l}{2b(l - b)}(ct - b + x). \tag{17}$$

In this stage $\partial y/\partial x$ is the mean of $\eta/b$ and $-\eta/(l - b)$, the original slopes of the two parts of the string. It begins when a wave travelling with velocity $c$ from $b$ has had time to reach $x$. It will continue until the term in $2px/c$ or $2p(l - b)/c$ no longer vanishes. The former yields the expression

$$\tfrac12e^{-p(b+x)/c}\,\frac{c}{p}H(t) = \tfrac12(ct - b - x) \qquad (b + x < ct), \tag{18}$$

and, adding this to the previous contributions, we find

$$y = -\frac{\eta x}{l - b}. \tag{19}$$

The beginning of this stage corresponds to the time it would take a wave to travel from $b$ to 0 and be reflected back to $x$. The part of the string reached by this reflected wave is therefore parallel to the original position of the part where $b < x < l$.

When $ct = (b - x) + 2(l - b)$ a wave reflected at $x = l$ arrives, and afterwards

$$y = \frac{\eta}{b(l - b)}\{(\tfrac12l - b)x + \tfrac12l(ct + b) - l^2\}. \tag{20}$$

This holds until $ct = b + x + 2(l - b)$; in the next stage

$$y = \eta x/b. \tag{21}$$

When $ct = 2l$, the whole of the string is back in its original position; the term in $e^{-2pl/c}$ then begins to affect the motion, and the whole process repeats itself. We see that at any instant the string is in three straight pieces. The two end pieces are parallel to the two portions of the string in its original position, and are at rest. For the middle portion the gradient $\partial y/\partial x$ is the mean of those for the end portions, and the transverse velocity is $\pm\dfrac{\eta lc}{2b(l - b)}$. The middle portion is always either extending or withdrawing at each end with velocity $c$.

The partial fraction rule and the expansion in negative exponentials are alternative ways of evaluating the Bromwich integral. The former in general analyses it into normal modes, the latter into progressive waves. In the problem just considered the exact periodicity enables the wave expansion to give a solution in finite terms for any value of the time, and it is therefore decidedly the more useful form. It is not, however, a general rule that the solution of a problem in small oscillations is strictly periodic, and the evaluation of the successive waves may become laborious if the motion is required after a long time.

19·03. Solution for a general initial disturbance. We take $y = \phi(x)$, $\dot y = 0$ at $t = 0$; then the subsidiary equation is

$$\frac{\partial^2y}{\partial x^2} - \frac{p^2}{c^2}y = -\frac{p^2}{c^2}\phi(x), \tag{1}$$

and we want a solution that vanishes at $x = 0$ and $l$. Using the method of variation of parameters, we assume that the solution is

$$y = A\cosh\frac{px}{c} + B\sinh\frac{px}{c}, \tag{2}$$

where $A$ and $B$ are functions of $x$ subject to

$$A'\cosh\frac{px}{c} + B'\sinh\frac{px}{c} = 0. \tag{3}$$

Substituting in (1) we find

$$A'\sinh\frac{px}{c} + B'\cosh\frac{px}{c} = -\frac{p}{c}\phi(x). \tag{4}$$

Hence

$$A' = \frac{p}{c}\phi(x)\sinh\frac{px}{c}, \qquad B' = -\frac{p}{c}\phi(x)\cosh\frac{px}{c}. \tag{5}$$

For $x = 0$, $y = A(0)$; hence, since $y = 0$ at $x = 0$,

$$A = \int_0^x\frac{p}{c}\phi(\xi)\sinh\frac{p\xi}{c}\,d\xi. \tag{6}$$

Also, since $y = 0$ at $x = l$,

$$B(l)\sinh\frac{pl}{c} = -A(l)\cosh\frac{pl}{c}, \tag{7}$$

which gives $B(l)$; and then $B(x)$ is determined since we have $B'$ from (5). We have

$$B(x) = -\coth\frac{pl}{c}\int_0^l\frac{p}{c}\phi(\xi)\sinh\frac{p\xi}{c}\,d\xi + \int_x^l\frac{p}{c}\phi(\xi)\cosh\frac{p\xi}{c}\,d\xi. \tag{8}$$

Substituting for $A$ and $B$ in (2) we have

$$y = \int_{\xi=0}^{x}\phi(\xi)\,d_\xi\left\{\frac{\sinh p(l - x)/c\,\cosh p\xi/c}{\sinh pl/c}\right\} - \int_{\xi=x}^{l}\phi(\xi)\,d_\xi\left\{\frac{\sinh px/c\,\cosh p(l - \xi)/c}{\sinh pl/c}\right\}. \tag{9}$$

By the partial fraction rule

$$\frac{\sinh p(l - x)/c\,\cosh p\xi/c}{\sinh pl/c} = \frac{l - x}{l} - 2\sum_{n=1}^{\infty}\frac{1}{n\pi}\sin\frac{n\pi x}{l}\cos\frac{n\pi\xi}{l}\cos\frac{n\pi ct}{l},$$

$$\frac{\sinh px/c\,\cosh p(l - \xi)/c}{\sinh pl/c} = \frac{x}{l} + 2\sum_{n=1}^{\infty}\frac{1}{n\pi}\sin\frac{n\pi x}{l}\cos\frac{n\pi\xi}{l}\cos\frac{n\pi ct}{l}.$$

The first terms are independent of $\xi$ and contribute nothing to the integrals; and

$$y = -2\int_{\xi=0}^{l}\phi(\xi)\,d_\xi\left(\sum_{n=1}^{\infty}\frac{1}{n\pi}\sin\frac{n\pi x}{l}\cos\frac{n\pi ct}{l}\cos\frac{n\pi\xi}{l}\right). \tag{10}$$

The series represents a function with finite discontinuities and a Stieltjes integral is required.

If we invert the order of integration and summation and then differentiate $\cos n\pi\xi/l$ in the separate terms, we get

$$y = \frac{2}{l}\sum_{n=1}^{\infty}\int_0^l\phi(\xi)\sin\frac{n\pi x}{l}\sin\frac{n\pi\xi}{l}\cos\frac{n\pi ct}{l}\,d\xi. \tag{11}$$

This is Fourier’s solution; putting $t = 0$ we get the sine series

$$\phi(x) = \frac{2}{l}\sum_{n=1}^{\infty}\int_0^l\phi(\xi)\sin\frac{n\pi x}{l}\sin\frac{n\pi\xi}{l}\,d\xi. \tag{12}$$

To get a wave expansion we have

$$\frac{\sinh p(l - x)/c\,\cosh p\xi/c}{\sinh pl/c} = \tfrac12e^{-p(x-\xi)/c}\,(1 - e^{-2p(l-x)/c})(1 + e^{-2p\xi/c})\sum_{n=0}^{\infty}e^{-2npl/c} \qquad (\xi < x), \tag{13}$$

$$\frac{\sinh px/c\,\cosh p(l - \xi)/c}{\sinh pl/c} = \tfrac12e^{-p(\xi-x)/c}\,(1 - e^{-2px/c})(1 + e^{-2p(l-\xi)/c})\sum_{n=0}^{\infty}e^{-2npl/c} \qquad (\xi > x). \tag{14}$$

All the exponents are negative multiples of $p$; this always happens. Also

$$e^{-p(x-\xi)/c}\,(1 - e^{-2p(l-x)/c})(1 + e^{-2p\xi/c})\,H(t) = H\!\left(t - \frac{x - \xi}{c}\right) - H\!\left(t - \frac{2l - x - \xi}{c}\right) + H\!\left(t - \frac{x + \xi}{c}\right) - H\!\left(t - \frac{2l - x + \xi}{c}\right), \tag{15}$$

which is constant except for jumps of $\pm1$ when the arguments of the unit functions pass through 0. Then the terms arising from $n = 0$ give

$$y = \tfrac12\{\phi(x - ct) - \phi(2l - x - ct) - \phi(ct - x) + \phi(ct + x - 2l)\}$$

$$\quad + \tfrac12\{\phi(ct + x) - \phi(ct - x) - \phi(2l - x - ct) + \phi(2l + x - ct)\}, \tag{16}$$

where only those terms are to be taken such that the values of $\xi$ that make the arguments of the corresponding unit functions vanish lie within the ranges of the respective integrals. The first term in the first line is seen to represent the direct wave from points between 0 and $x$, the third, which begins at time $x/c$, the wave reflected at $x = 0$, and the other two the reflexions of these at $x = l$. Corresponding relations hold for the terms in the second line except that they give the contributions from waves starting between $x$ and $l$. It will be seen that the solution holds up to time $2l/c$, by which time all the terms have disappeared on account of their arguments passing out of the ranges permitted; but then the terms from $n = 1$ in (13) enter and repeat the entire motion. At time $4l/c$ they also have all disappeared but those from $n = 2$ enter, and so on indefinitely.

In the foregoing cases the motion repeats itself exactly at regular intervals. In the following it does not.


19·04. A uniform heavy string of length $2l$ is fixed at the ends. A particle of mass $m$ is attached to the middle of the string. Initially the string is straight and under tension $P$. A transverse impulse $J$ is given to the particle. Find the subsequent motion of the particle.*
We take $x$ zero at the middle of the string. By symmetry we need consider only the range of values $0 < x \le l$. Call the displacement of the particle $\eta$. When $t = 0$, $y$ and $\partial y/\partial t$ are zero except at $x = 0$. The subsidiary equation for the string therefore needs no additional terms for the initial conditions. The conditions that when $x = 0$, $y = \eta$, and when $x = l$, $y = 0$, give, therefore,

$$y = \eta\,\frac{\sinh p(l - x)/c}{\sinh pl/c}. \tag{1}$$

* Cf. Rayleigh, Theory of Sound, 1, 1894, 204.

The equation of motion of the particle, taking account of the equal tensions in the string on both sides of it, is

$$m\frac{\partial^2\eta}{\partial t^2} = 2P\left(\frac{\partial y}{\partial x}\right)_{x=0}. \tag{2}$$

At $t = 0$, $\eta = 0$, $m\dot\eta = J$. Hence the subsidiary equation for $\eta$ is

$$mp^2\eta = 2P\left(\frac{\partial y}{\partial x}\right)_{x=0} + pJ = -2P\eta\,\frac{p}{c}\coth\frac{pl}{c} + pJ, \tag{3}$$

and therefore

$$\eta = \frac{Jc}{mpc + 2P\coth pl/c}. \tag{4}$$

If $\rho$ is the line density of the string, $P = \rho c^2$, and the mass of the string is $2\rho l$. Put

$$\frac{2\rho l}{m} = k; \qquad 2P = \frac{kmc^2}{l}. \tag{5}$$

Then

$$\eta = \frac{lJ/mc}{(pl/c) + k\coth pl/c}. \tag{6}$$

To interpret by the partial fraction rule, we recollect that the system is a stable one without dissipation, and therefore all zeros of the denominator are purely imaginary. With $pl/c = i\omega$, $\omega$ satisfies

$$\omega = k\cot\omega. \tag{7}$$

There is a root between every two consecutive multiples of $\pi$, positive or negative, and the roots occur in pairs of equal magnitude. Then

$$\eta = \frac{lJ}{mc}\sum\frac{e^{i\omega ct/l}}{(i\omega c/l)(1 + k\operatorname{cosec}^2\omega)(l/c)} = \frac{2lJ}{mc}\sum\frac{1}{\omega(1 + k\operatorname{cosec}^2\omega)}\sin\frac{\omega ct}{l}, \tag{8}$$

the second summation being only over positive values of $\omega$.

If a root of (7) is $n\pi + \lambda$, where $\lambda < \pi$,

$$(n\pi + \lambda)\tan\lambda = k, \tag{9}$$

and $\lambda = k/n\pi$ approximately. Then the series converges like $\sum n^{-3}$. Four or five terms should therefore be enough to give 1 % accuracy. For higher accuracy the labour would be great.

For $t$ not too great an exact solution can be found easily by the wave expansion. It is
convenient to change the unit of time to $l/c$, the time taken for a wave to travel half the
length of the string. We also replace $J/m$ by $V$. Then

$$\eta = \frac{V}{p + k\coth p} = \frac{V(1 - e^{-2p})}{(p+k) - (p-k)e^{-2p}}$$

$$= \frac{V}{p+k}(1 - e^{-2p})\left\{1 + \frac{p-k}{p+k}e^{-2p} + \left(\frac{p-k}{p+k}\right)^2 e^{-4p} + \dots\right\}$$

$$= \frac{V}{p+k}\left\{1 - \frac{2k}{p+k}e^{-2p} - \frac{2k(p-k)}{(p+k)^2}e^{-4p} - \dots\right\}. \tag{10}$$

The first term is zero for $t < 0$, and for $t > 0$ it is equal to

$$\frac{V}{k}(1 - e^{-kt}) \quad (t > 0). \tag{11}$$

After $t = 2$ the second term no longer vanishes. We have

$$\frac{k}{(p+k)^2} = \frac{1}{k} - \frac{p}{k(p+k)} - \frac{p}{(p+k)^2} = \frac{1}{k} - \frac{1}{k}e^{-kt} - te^{-kt}, \tag{12}$$

and the second term therefore contributes

$$-\frac{2V}{k}\{1 - e^{-k(t-2)} - k(t-2)e^{-k(t-2)}\} \quad (t > 2). \tag{13}$$

The third term is zero for $t < 4$; for $t > 4$ it is easily found to be

$$\frac{2V}{k}[1 - \{1 + k(t-4) + k^2(t-4)^2\}e^{-k(t-4)}]. \tag{14}$$

The process may be extended to determine the motion up to any time desired. The entry 
of a new term into the solution corresponds to the arrival of a new pair of waves reflected 
at the ends. 
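
The agreement between the normal-mode series and the wave expansion can be checked numerically. The following is only a sketch under stated assumptions: time is measured in units of $l/c$ and $\eta$ in units of $V$, the roots of $\omega = k\cot\omega$ are found by bisection, and the function names `mode_frequencies`, `eta_modes` and `eta_waves` are ours, introduced purely for illustration. The wave form uses only the three terms written out above, so it is valid for $t < 6$.

```python
import math

def mode_frequencies(k, count):
    """Positive roots of omega = k*cot(omega), one in each interval (n*pi, (n+1)*pi)."""
    roots = []
    for n in range(count):
        lo = n * math.pi + 1e-9
        hi = (n + 1) * math.pi - 1e-9
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            # omega - k*cot(omega) increases from -inf to +inf across the interval
            if mid - k * math.cos(mid) / math.sin(mid) < 0.0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

def eta_modes(t, k, count=400):
    """Normal-mode series: eta/V = 2 * sum sin(omega t) / (omega (1 + k cosec^2 omega))."""
    total = 0.0
    for w in mode_frequencies(k, count):
        total += 2.0 * math.sin(w * t) / (w * (1.0 + k / math.sin(w) ** 2))
    return total

def eta_waves(t, k):
    """First three terms of the wave expansion; assumed valid only for t < 6."""
    total = 0.0
    if t > 0:
        total += (1.0 - math.exp(-k * t)) / k
    if t > 2:
        s = t - 2
        total -= (2.0 / k) * (1.0 - (1.0 + k * s) * math.exp(-k * s))
    if t > 4:
        s = t - 4
        total += (2.0 / k) * (1.0 - (1.0 + k * s + (k * s) ** 2) * math.exp(-k * s))
    return total
```

With a few hundred modes the two forms should agree to several figures. The wave form needs only a handful of exponentials, which is the point made in the text about the labour of the series.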

19.05. A uniform heavy bar is hanging vertically from one end, and a mass m is suddenly
attached to the lower end. Find how the tension at the upper end varies with the time.*

For a light bar it is easy to see that the added mass will perform harmonic oscillations 
about the position of equilibrium; when it reaches its lowest position the extra tension 
is therefore twice the weight. This feature accounts for the danger of suddenly attaching 
a load that a system might be well able to support if the load was added gradually. 

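For a heavy bar, if $x$ be the distance from the upper end, $y$ the longitudinal displacement,
$y$ satisfies

$$\rho\frac{\partial^2 y}{\partial t^2} - E\frac{\partial^2 y}{\partial x^2} = F, \tag{1}$$

where $\rho$ is the density, $E$ Young's modulus, and $F$ the external force per unit volume, in
this case $\rho g$. Put $E/\rho = c^2$, and let the displacement of a particle under the tension
before the weight is attached be $y_0$. Then

$$-c^2\frac{\partial^2 y_0}{\partial x^2} = g. \tag{2}$$

When $x = 0$, $y_0 = 0$; and when $x = l$, the length of the bar, $\partial y_0/\partial x = 0$. Hence

$$y_0 = \frac{glx}{c^2}\left(1 - \frac{1}{2}\frac{x}{l}\right). \tag{3}$$

After the weight is attached we still have

$$\frac{\partial^2 y}{\partial t^2} - c^2\frac{\partial^2 y}{\partial x^2} = g = -c^2\frac{\partial^2 y_0}{\partial x^2}, \tag{4}$$

and when $t = 0$, $y = y_0$, $\dot y = 0$. Hence the subsidiary equation is

$$p^2 y - c^2\frac{\partial^2 y}{\partial x^2} = p^2 y_0 - c^2\frac{\partial^2 y_0}{\partial x^2}, \tag{5}$$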


* Love, Elasticity, § 283.

and the solution that vanishes with $x$ is given by

$$y - y_0 = A\sinh px/c, \tag{6}$$

where $A$ is independent of $x$.

If $\varpi$ is the cross section of the bar, the equation of motion of the mass $m$ is

$$m\frac{\partial^2 y}{\partial t^2} = mg - E\varpi\frac{\partial y}{\partial x}, \tag{7}$$

the derivatives being evaluated at $x = l$. The subsidiary equation is

$$mp^2 y = mg + mp^2 y_0 - E\varpi\frac{\partial y}{\partial x}, \tag{8}$$

and on substitution for $y$ from (6)

$$\left(p^2\sinh\frac{pl}{c} + \frac{E\varpi p}{mc}\cosh\frac{pl}{c}\right)A = g. \tag{9}$$

The tensile stress at the upper end is

$$g\rho l + \frac{EpA}{c} = g\rho l + \frac{Emg}{E\varpi\cosh pl/c + mcp\sinh pl/c}. \tag{10}$$

If $k$ is the ratio of the mass of the weight to that of the bar,

$$k = m/\rho\varpi l, \qquad mc/E\varpi = kl/c, \tag{11}$$

and the stress is

$$\frac{gm}{\varpi}\left\{\frac{1}{k} + \frac{1}{\cosh pl/c + k(pl/c)\sinh pl/c}\right\}. \tag{12}$$

We see that $gm/\varpi k$ is the stress due to the weight of the bar alone, and $gm/\varpi$ is the
statical stress due to the added load. To evaluate the actual stress we expand the
operator in powers of $e^{-pl/c}$. We take $l/c$ for the new unit of time; then

$$\frac{1}{kp\sinh p + \cosh p} = \frac{2e^{-p}}{(kp+1) - (kp-1)e^{-2p}} = \frac{2e^{-p}}{kp+1}\left\{1 + \frac{kp-1}{kp+1}e^{-2p} + \dots\right\}. \tag{13}$$

The first term vanishes to time unity, and afterwards is equal to

$$2(1 - e^{-(t-1)/k}). \tag{14}$$

This increases steadily up to time 3, when the next term enters. Again,

$$\frac{kp-1}{(kp+1)^2} = -1 + \frac{kp}{kp+1} + \frac{2kp}{(kp+1)^2} = -1 + e^{-t/k} + 2(t/k)e^{-t/k}, \tag{15}$$

and the first two terms of (13), for $t > 3$, are equal to

$$2e^{-(t-3)/k}\{1 + (2/k)(t-3) - e^{-2/k}\}. \tag{16}$$

This has a maximum when

$$1 + e^{-2/k} = \frac{2}{k}(t-3). \tag{17}$$

This has a root less than 5 if

$$4/k > 1 + e^{-2/k}, \tag{18}$$

which is an equality if $k = 2.7$. Thus if $k = 1$ or 2 the maximum stress will occur before
$t = 5$. If $k = 1$ it is when $t = 3.568$, and is equal to $3.266\,gm/\varpi$, that is, 1.633 times the
statical stress. If $k = 2$ the corresponding results are $t = 4.368$, $2.520\,gm/\varpi$, and 1.680
times the statical stress.

The third term enters at $t = 5$, and afterwards is equal to

$$2[1 - e^{-(t-5)/k} - 2\{(t-5)/k\}^2 e^{-(t-5)/k}]. \tag{19}$$

If $k = 4$, the maximum stress is when $t = 6.183$, and is equal to $2.29\,gm/\varpi$. The statical
stress is $1.25\,gm/\varpi$, so that the ratio is 1.83.

This solution and that of 19.04 are due to Bromwich.
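
The quoted maxima can be reproduced by a direct search over the first three terms of the expansion. The sketch below assumes the unit of time $l/c$ and measures stress in units of $gm/\varpi$. The names `dynamic_term` and `peak_stress` and the grid step are illustrative choices of ours, and the search is restricted to $t < 7$, where no further term has yet entered.

```python
import math

def dynamic_term(t, k):
    """Sum of the first three wave terms of the operator expansion; valid for t < 7."""
    total = 0.0
    if t > 1:
        total += 2.0 * (1.0 - math.exp(-(t - 1) / k))
    if t > 3:
        s = (t - 3) / k
        total += 2.0 * (-1.0 + math.exp(-s) * (1.0 + 2.0 * s))
    if t > 5:
        s = (t - 5) / k
        total += 2.0 * (1.0 - math.exp(-s) - 2.0 * s * s * math.exp(-s))
    return total

def peak_stress(k, t_end=7.0, step=0.0005):
    """Grid search for the largest stress; returns (t_max, stress, stress / statical stress)."""
    best_t, best = 0.0, float("-inf")
    for i in range(int(round(t_end / step))):
        t = i * step
        stress = 1.0 / k + dynamic_term(t, k)  # 1/k is the weight of the bar alone
        if stress > best:
            best_t, best = t, stress
    statical = 1.0 / k + 1.0
    return best_t, best, best / statical
```

For example, `peak_stress(1.0)` and `peak_stress(4.0)` should land close to the times and ratios quoted in the text for $k = 1$ and $k = 4$.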


19.06. Periodic disturbance at an internal point. It is sometimes argued that if
a periodic motion is enforced at an internal point of a system of finite size, it represents a
continual supply of energy, and the disturbance will ultimately exceed any bound,
irrespective of any question of resonance.* It is interesting to examine what will actually
happen in a simple case where these conditions are satisfied. First, consider a string of
length $2l$, originally at rest, and suppose that for $t > 0$ the middle point is made to vibrate
harmonically, the displacement being $\sin nt$. The operational solution expressing this
displacement at $x = 0$ and zero displacement at $x = l$ is

$$y = \frac{np}{p^2 + n^2}\,\frac{\sinh p(l-x)/c}{\sinh pl/c} \tag{1}$$

$$= e^{-px/c}(1 - e^{-2p(l-x)/c})(1 + e^{-2pl/c} + e^{-4pl/c} + \dots)\sin nt\,H(t)$$

$$= \sin n\left(t - \frac{x}{c}\right) - \sin n\left(t - \frac{2l-x}{c}\right) + \sin n\left(t - \frac{2l+x}{c}\right) - \sin n\left(t - \frac{4l-x}{c}\right) + \dots, \tag{2}$$

where we are to include only those terms whose arguments are positive. At any instant
the motion therefore consists of a number of superposed harmonic waves of the same
period, their number increasing indefinitely with the time. At first sight this suggests
that the disturbance may grow indefinitely; on the other hand it is possible that this may
be prevented by successive waves interfering. This can be tested by evaluating (1) by
the partial fraction rule. The pole at $in$ contributes

$$\frac{n\cdot in}{in\cdot 2in}\,\frac{\sin n(l-x)/c}{\sin nl/c}\,e^{int} = \frac{1}{2i}\,\frac{\sin n(l-x)/c}{\sin nl/c}\,e^{int}, \tag{3}$$

and with that at $-in$ gives

$$\frac{\sin n(l-x)/c}{\sin nl/c}\sin nt. \tag{4}$$

The pole at $pl/c = r\pi i$ ($r$ an integer) gives

$$\frac{nr\pi ic/l}{n^2 - r^2\pi^2c^2/l^2}\,\frac{i\sin r\pi(l-x)/l}{r\pi i\cos r\pi}\,e^{r\pi ict/l} = i\,\frac{nc/l}{n^2 - r^2\pi^2c^2/l^2}\,(-1)^r\sin\frac{r\pi(l-x)}{l}\,e^{r\pi ict/l}, \tag{5}$$

and altogether

$$y = \frac{\sin n(l-x)/c}{\sin nl/c}\sin nt - 2\sum_{r=1}^{\infty}(-1)^r\frac{nc/l}{n^2 - r^2\pi^2c^2/l^2}\sin\frac{r\pi(l-x)}{l}\sin\frac{r\pi ct}{l}. \tag{6}$$

* Cf. H. M. Macdonald, Proc. Roy. Soc. A, 98, 1921, 409-11.

The series is absolutely convergent so long as $n$ is not an exact multiple of $\pi c/l$, and its
sum for any $x$ and $t$ is less than the result of replacing the sines by 1 and taking the moduli
of all the terms. Hence $y$ does not increase beyond limit. If we take the kinetic or the
potential energy we again get a series that converges like $\sum r^{-2}$, and the energy never
passes a certain value. The rate of supply of energy is in fact proportional to the product
$(\partial y/\partial x)(\partial y/\partial t)$ evaluated at $x = 0$, and one factor consists of sines, the other of cosines, of multiples
of the time. The rate of supply of energy is sometimes positive, sometimes negative.
The mean rate of supply of energy over a long time tends to zero.
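
That the growing number of superposed waves and the bounded normal-mode series describe the same motion can be confirmed numerically. The sketch below is illustrative only: the function names `y_waves` and `y_modes`, the default values $l = c = 1$ and the truncation at 20000 terms are our assumptions, and $n$ must not be a multiple of $\pi c/l$.

```python
import math

def y_waves(x, t, n, l=1.0, c=1.0):
    """Wave expansion: alternating outgoing and reflected waves that have already arrived."""
    total = 0.0
    j = 0
    while True:
        delay_out = (2 * j * l + x) / c          # j-th wave travelling away from x = 0
        delay_back = (2 * (j + 1) * l - x) / c   # its reflection from the fixed end x = l
        if t - delay_out <= 0:
            break
        total += math.sin(n * (t - delay_out))
        if t - delay_back > 0:
            total -= math.sin(n * (t - delay_back))
        j += 1
    return total

def y_modes(x, t, n, l=1.0, c=1.0, terms=20000):
    """Partial-fraction form: forced oscillation plus free normal modes."""
    total = math.sin(n * (l - x) / c) / math.sin(n * l / c) * math.sin(n * t)
    for r in range(1, terms + 1):
        w = r * math.pi * c / l
        total -= (2.0 * (-1) ** r * (n * c / l) / (n * n - w * w)
                  * math.sin(r * math.pi * (l - x) / l) * math.sin(w * t))
    return total
```

The coefficients fall off like $r^{-2}$, so the truncated series is accurate only to about the reciprocal of the number of terms retained. The wave sum is exact for any finite $t$.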

In the problem just treated the displacement is prescribed to vary finitely at one point,
and it might be thought that it is this condition that prevents indefinite growth at any
point. We therefore consider also the case where the transverse force, not the displacement,
is prescribed to be $\sin nt$ for $t > 0$. Denoting the displacement at $x = 0$ by $y_0$, we have now

$$y = y_0\,\frac{\sinh p(l-x)/c}{\sinh pl/c}, \tag{7}$$

$$\sin nt = \frac{np}{p^2 + n^2} = \frac{p}{c}\,y_0\cdot 2P\coth pl/c, \tag{8}$$

and

$$y = \frac{nc}{2P(p^2 + n^2)}\,\frac{\sinh p(l-x)/c}{\cosh pl/c} \tag{9}$$

$$= \frac{c}{2Pn}\,e^{-px/c}(1 - e^{-2p(l-x)/c})(1 - e^{-2pl/c} + e^{-4pl/c} - \dots)(1 - \cos nt)$$

$$= \frac{c}{2Pn}\left[\left\{1 - \cos n\left(t - \frac{x}{c}\right)\right\} - \left\{1 - \cos n\left(t - \frac{2l-x}{c}\right)\right\} - \left\{1 - \cos n\left(t - \frac{2l+x}{c}\right)\right\} + \dots\right], \tag{10}$$

where again only the terms with positive arguments are to be included. If we evaluate
(9) by the partial fraction rule we get

$$y = \frac{c}{2Pn}\,\frac{\sin n(l-x)/c}{\cos nl/c}\sin nt + \sum_{s=0}^{\infty}\frac{nc}{P(s+\tfrac12)\pi\{n^2 - (s+\tfrac12)^2\pi^2c^2/l^2\}}\cos\frac{(s+\tfrac12)\pi x}{l}\sin\frac{(s+\tfrac12)\pi ct}{l}. \tag{11}$$

The series in this case converges like $\sum r^{-3}$ and again there is an upper bound to the
displacement. The force at $x = 0$ is now $\sin nt$ and the velocity is a series of cosines, so that
again after a long time as much energy comes to be taken out as is put in.

The system is supposed to be of finite extent. If we make l tend to infinity we get an 
infinitely long string, which is hardly a practical possibility, but the same analysis would 
apply fairly well to a gas in a long tube open at both ends. Simply letting l increase we 
see that the waves arising from terms with arguments containing l take longer and longer 
to return, and the solution for any given $x$ will reduce to the first term so long as $ct < 2l - x$,
and therefore for a longer range of time the larger $l$ is. Proceeding to the limit we see that
the disturbance consists of a wave of fixed amplitude extending for a greater and greater
distance from the origin; the energy therefore does increase indefinitely. It is therefore
possible for a finite force to produce an indefinitely large amount of energy in an infinite
system provided that it is made to act long enough. It will be noticed that in this case the
operational solution reduces to the first term in the wave expansion and could be found
by writing the subsidiary equation as

$$p^2 y - c^2\frac{\partial^2 y}{\partial x^2} = 0,$$

and taking the solution as $Ae^{-px/c}$, rejecting the solution $Be^{px/c}$ on the ground that it
would represent a wave travelling inwards and therefore a source of energy at a large
distance.

It has been supposed in (6) that $n$ is not an exact multiple of $\pi c/l$, in (11) that $n$ is not of
the form $(s + \tfrac12)\pi c/l$, where $s$ is an integer. In the special cases where this restriction is not
satisfied we have resonance and the disturbance will grow. The modification of the solution
to take account of the double poles is straightforward, but not of any special interest
since indefinite increase of the disturbance in the case of resonance is formally possible
even for a system with one degree of freedom. The main conclusion is that a harmonic
force of limited amount will never impart more than a given amount of energy to a system,
however long it acts, unless either the system is of infinite extent or the period of the force
agrees exactly with a free period of the system. In the latter case the solution will ultimately
need modification to take account of neglected higher powers of the displacement.


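19.07. Problems of spherical symmetry. The equation of propagation of sound
in three dimensions is

$$\frac{\partial^2\phi}{\partial t^2} = c^2\nabla^2\phi, \tag{1}$$

where $\phi$ is the velocity potential; the velocity components are

$$u_i = \frac{\partial\phi}{\partial x_i}, \tag{2}$$

and the pressure is

$$P = -\rho\frac{\partial\phi}{\partial t}. \tag{3}$$

(Capital P is used because we want p for the Heaviside operator.) Now if $\phi$ is a function
of $r$ and $t$ only, where $r$ is the distance from a fixed point,

$$\nabla^2\phi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\phi}{\partial r}\right) = \frac{\partial^2\phi}{\partial r^2} + \frac{2}{r}\frac{\partial\phi}{\partial r} = \frac{1}{r}\frac{\partial^2}{\partial r^2}(r\phi). \tag{4}$$

Hence

$$\frac{\partial^2}{\partial t^2}(r\phi) = c^2\frac{\partial^2}{\partial r^2}(r\phi), \tag{5}$$

and this has the same form as the equation of vibration of a string or transmission of sound
in one dimension, the dependent variable being now $r\phi$. Hence a general solution is

$$r\phi = f(r - ct) + g(r + ct). \tag{6}$$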

The first term will represent a disturbance travelling outwards, the second one travelling 
inwards. 
19.08. Explosion wave. Consider a spherical region of high pressure, surrounded
by an infinitely extended region of uniform pressure. The boundary between them is
solid, and the whole is at rest. Suddenly the boundary is annihilated; find the subsequent
motion. We suppose the motion small enough for squares of the displacements to be
neglected. At all points, for $t > 0$, 19.07 (1) holds. We take $P$ to be the excess of the pressure
above the undisturbed pressure outside the sphere. As there is initially no motion, $\phi$ is
constant everywhere, and may be taken as zero. The excess pressure is given by

$$P = -\rho\,\partial\phi/\partial t. \tag{1}$$

This is initially a positive constant $P_0$ when $r < a$, and 0 when $r > a$. Then we can take the
subsidiary equations to be

$$\left(\frac{\partial^2}{\partial r^2} - \frac{p^2}{c^2}\right)r\phi = \frac{pP_0 r}{\rho c^2} \quad (r < a); \qquad = 0 \quad (r > a). \tag{2}$$

The pressure must remain finite at the centre, and the disturbance for $r > a$ cannot include
any wave travelling inwards. Then

$$r\phi = -\frac{P_0 r}{\rho p} + A\sinh\frac{pr}{c} \quad (r < a); \qquad r\phi = Be^{-pr/c} \quad (r > a). \tag{3}$$

The pressure and the radial velocity must be continuous at $r = a$; hence $\phi$ and $\partial\phi/\partial r$ must
be continuous. These give

$$-\frac{P_0 a}{\rho p} + A\sinh\frac{pa}{c} = Be^{-pa/c}, \tag{4}$$

$$-\frac{P_0}{\rho p} + \frac{p}{c}A\cosh\frac{pa}{c} = -\frac{p}{c}Be^{-pa/c}, \tag{5}$$

whence

$$A = \frac{P_0}{\rho p^2}(c + ap)e^{-pa/c}, \qquad B = \frac{P_0}{2\rho p^2}\{(c - ap)e^{pa/c} - (c + ap)e^{-pa/c}\}. \tag{6}$$

Thus outside the original sphere

$$r\phi = \left[\frac{P_0}{2\rho p^2}(c - ap)e^{-p(r-a)/c} - \frac{P_0}{2\rho p^2}(c + ap)e^{-p(r+a)/c}\right]H(t). \tag{7}$$

The associated pressure change is

$$P = -\frac{P_0}{2r}\left[\left(\frac{c}{p} - a\right)e^{-p(r-a)/c} - \left(\frac{c}{p} + a\right)e^{-p(r+a)/c}\right]H(t)$$

$$= -\frac{P_0}{2r}\left[e^{-p(r-a)/c}(ct - a) - e^{-p(r+a)/c}(ct + a)\right]H(t).$$

For given $r > a$, therefore, the pressure change is zero up to time $(r-a)/c$, when the first
wave from the compressed region arrives, and after $(r+a)/c$, when the wave from the
most distant point passes. At intermediate times it is equal to $P_0(r - ct)/2r$. This is equal to
$P_0 a/2r$ just after the first wave arrives, $-P_0 a/2r$ when the last passes, and varies linearly
with the time in between. The compression in front of the shock is associated with an
equal rarefaction in the rear.*

Within the sphere the pressure is

$$P = P_0 - \frac{P_0}{2r}\left(\frac{c}{p} + a\right)\{e^{-p(a-r)/c} - e^{-p(a+r)/c}\}H(t).$$

This is equal to $P_0$ up to time $(a-r)/c$, then drops suddenly to $P_0(1 - a/2r)$, decreases
linearly with the time till it reaches $-P_0 a/2r$ at time $(a+r)/c$, and then rises suddenly
to zero. The infinity in the pressure at the centre is only instantaneous, for the time the
disturbance lasts at a given place is $2r/c$, which vanishes at the centre. It is due to the
simultaneous arrival of elementary waves from all points on the surface; at other points
the waves from different parts arrive at different times, giving a finite disturbance of
pressure over a non-zero interval. If $r < \tfrac12 a$ the pressure becomes negative immediately
on the arrival of the disturbance. Strictly, the occurrence of an infinity in the solution
means that squares of the disturbances cannot be neglected within a certain range of
$r$ and $t$; but this range will be smaller the smaller $P_0$ is.
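
The pressure history described above can be collected into a single piecewise function. This is a sketch under assumptions rather than a formula taken from the text. The linear stage is written as $P_0(r - ct)/2r$ both outside and inside the sphere, which reproduces the limiting values $P_0(1 - a/2r)$ and $-P_0 a/2r$ quoted for the interior. The name `excess_pressure` and the default units $a = c = P_0 = 1$ are ours.

```python
def excess_pressure(r, t, a=1.0, c=1.0, p0=1.0):
    """Excess pressure at radius r > 0 and time t > 0 after the boundary is removed."""
    first = abs(r - a) / c   # arrival of the nearest elementary wave
    last = (r + a) / c       # passage of the wave from the most distant point
    if t < first:
        # undisturbed: original excess inside the sphere, none outside
        return p0 if r < a else 0.0
    if t < last:
        # linear stage: compression followed by an equal rarefaction
        return p0 * (r - c * t) / (2.0 * r)
    return 0.0
```

For instance, at $r = 2a$ the value just after $t = a/c$ is close to $P_0/4$, and at any $r < \tfrac12 a$ the pressure is negative as soon as the disturbance arrives.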

The behaviour of the velocity at distant points is similar to that of the pressure. If
$u$ is the radial velocity,

$$u = \frac{\partial\phi}{\partial r} = -\frac{p}{c}\phi - \frac{\phi}{r}.$$

If $r$ is great the first term is simply $P/\rho c$, and its behaviour is inferred immediately. It
is proportional to $1/r$, the second to $1/r^2$. The first term gives no total outward displacement,
the outward movement during the stage of increased pressure being just cancelled by the
inward movement during the stage of decreased pressure. The second term, however,
gives a small velocity which vanishes at the beginning and end of the shock, and reaches
a positive maximum at time $r/c$. It produces a total radial displacement of order $a/r$
times the maximum given by the first term; this represents the fact that the matter
originally compressed expands till it reaches normal pressure, and the surrounding matter
moves outwards to make room for it.

The corresponding problem in one dimension would be that of an excess pressure $P_0$
within a length $2a$ of an infinite tube and suddenly released. Two waves of excess pressure
$\tfrac12 P_0$ would travel in opposite directions; the reduction of pressure in the rear of the
disturbance that we have found in three dimensions has no counterpart in one. We shall
see when we deal with applications of Bessel functions that it has a very striking
counterpart in two dimensions.

Disturbances that do not affect a given place until some definite instant are often
conveniently called pulses. The sound wave from an explosion is an example; so are flashes
of light from a rotating mirror and the elastic waves sent out by an earthquake.

19.09. Diverging waves produced by a sphere oscillating radially.† Suppose
that a sphere of radius $a$ begins at time 0 to oscillate radially in period $2\pi/n$. We require
the motion of the air outside it.

Initially all is at rest; hence

$$r\phi = Ae^{-pr/c}. \tag{1}$$

* Cf. Stokes, Phil. Mag. 34, 1849, 52.

† Love, Proc. Lond. Math. Soc. (2) 2, 1904, 88; Bromwich, ibid. (2) 15, 1916, 431.

When $r = a$ the outward displacement is, say, $(1/n)\sin nt$ when $t > 0$, and the outward velocity
$\cos nt$. Hence

$$r\phi = -\frac{p^2a^2}{(p^2 + n^2)(1 + ap/c)}\exp\left\{-\frac{p}{c}(r - a)\right\} \tag{2}$$

$$= -\frac{c^2a^2}{c^2 + n^2a^2}\exp\left\{-\frac{p}{c}(r - a)\right\}\left\{\cos nt + \frac{na}{c}\sin nt - \exp\left(-\frac{ct}{a}\right)\right\}$$

$$= -\frac{c^2a^2}{c^2 + n^2a^2}\left[\cos n\left(t - \frac{r-a}{c}\right) + \frac{na}{c}\sin n\left(t - \frac{r-a}{c}\right) - \exp\left(-\frac{ct - r + a}{a}\right)\right], \tag{3}$$

when $ct > r - a$.

The solution has a periodic part with a period equal to that of the given disturbance,
together with a part dying down with the time at a rate independent of $n$, but involving
the size of the sphere. As there is no corresponding term in the problem of the explosion
we may regard it as the result of the constraint introduced by the prescription of a definite
motion of the sphere. Its effect on the velocity or the pressure is to that of the second term
in a ratio comparable with $(c/na)^2$.


EXAMPLES 

1. A string of length $3l$ and line density $\rho$ is under a tension $P = \rho c^2$ and fixed at its ends. Two
particles of mass $m$ are attached to the points of trisection. A transverse impulse $J$ is given to one
particle. Show that the operational solution for the displacement of the other particle is

$$\frac{PJ}{c}\,\frac{\sinh pl/c}{(mp\sinh pl/c + (2P/c)\cosh pl/c)^2 - P^2/c^2},$$

and find the explicit solution up to time $3l/c$.

2. A heavy uniform string of length $3l$ and line density $\rho$ is fixed at the ends, and a particle of mass
$m$ is attached at a distance $l$ from one end. The tension is $\rho c^2$. A transverse velocity $v$ is given to the
particle. Show that the displacement of the particle is

$$\eta = \frac{mv\sinh pl/c\,\sinh 2pl/c}{mp\sinh pl/c\,\sinh 2pl/c + \rho c\sinh 3pl/c},$$

and evaluate $\eta$ up to time $4l/c$. (M.T. Sched. B, 1927.)

3. A closed pipe of length $l$ contains air whose density is slightly greater than that of the outside
air in the ratio $1 + s_0 : 1$. Everything being at rest, the disk closing one end of the pipe is suddenly
drawn aside. Show that after a time $t$ the velocity potential is

$$\phi = \frac{8lcs_0}{\pi^2}\sum_{r=0}^{\infty}\frac{(-1)^r}{(2r+1)^2}\cos\frac{(2r+1)\pi x}{2l}\sin\frac{(2r+1)\pi ct}{2l},$$

the origin being taken at the permanently closed end and $c$ being the velocity of sound.

4. Find the motion produced in the conditions of Ex. 3 except that the pipe is a narrow cone
instead of a cylinder. (M/c, Part III, 1932.)




Chapter 20 

CONDUCTION OF HEAT IN ONE AND THREE DIMENSIONS 


Then cold, and hot, and moist, and dry,
In order to their stations leap.

JOHN DRYDEN, Song for St Cecilia's Day


20.01. Equation of heat conduction. The rate of transmission of heat across a
surface by conduction is equal to $-k\,\partial V/\partial n$ per unit area, where $V$ is the temperature,
$k$ a constant of the material called the thermal conductivity, and $dn$ an element of the
normal to the surface. Hence we can show easily that in a uniform material the rate of
flow of heat into an element of volume $dx\,dy\,dz$ is $k\nabla^2 V\,dx\,dy\,dz$. But the quantity of heat
required to produce a rise of temperature $dV$ in unit mass is $c\,dV$, where $c$ is the specific
heat, and therefore that needed to produce a rise $dV$ in unit volume is $\rho c\,dV$, where $\rho$ is
the density.* Hence $V$ satisfies the equation

$$\frac{\partial}{\partial t}(\rho cV) = k\nabla^2 V. \tag{1}$$

If we put

$$k/\rho c = h^2, \tag{2}$$

$h^2$ is called the thermometric conductivity, and the equation becomes

$$\frac{\partial V}{\partial t} = h^2\nabla^2 V. \tag{3}$$

In addition there may be some internal source of heat. If this would raise the temperature
by $P$ per unit time if it stayed where it was generated, a term $P$ must be added
to the right of (3). Chemical and radioactive changes are the chief producers of heat
at internal points.

In applying the operational method of solution it is usually convenient to write $h^2q^2$
for $p$. The operational solutions are then functions of $q$; but $q$ must be expressed again in
terms of $p$ before interpreting.


20.02. Rod cooled at one end. Consider first a uniform rod, with its sides thermally
insulated, and initially at temperature $S$. At time 0 the end $x = 0$ is cooled to temperature
zero, and afterwards maintained at that temperature. The end $x = l$ is kept at temperature
$S$. Find the variation of temperature at other points of the rod.

The problem being one-dimensional, the equation of heat conduction is

$$\frac{\partial V}{\partial t} = h^2\frac{\partial^2 V}{\partial x^2}, \tag{1}$$

while at time 0, $V = S$. Hence the subsidiary equation is

$$h^2\frac{\partial^2 V}{\partial x^2} = pV - pS, \tag{2}$$

or

$$\frac{\partial^2 V}{\partial x^2} - q^2 V = -q^2 S. \tag{3}$$


* Thermodynamic effects require some modification of this statement, since there will be in general
a thermal expansion, and some of the heat is used in doing work against the pressure of the surrounding
material. The correction required is serious for a gas, but not important for a solid or liquid. Cf. Jeffreys,
Cartesian Tensors, Chapter 8, or Proc. Camb. Phil. Soc. 26, 1930, 101-6.


The end conditions are that $V = 0$ at $x = 0$, $t > 0$; $V = S$ at $x = l$. Hence

$$V = S\left(1 - \frac{\sinh q(l-x)}{\sinh ql}\right). \tag{4}$$

The operator is an even function of $q$ and therefore a single-valued function of $p$. The
poles are where $ql = \pm in\pi$, that is, $p = -h^2n^2\pi^2/l^2$, where $n$ is any integer. But the
negative values of $\Im(q)$ give the same values of $p$ as the positive ones, and therefore when
we apply the partial fraction rule we need consider only the positive and zero values.
The part arising from $p = 0$ is

$$S\left(1 - \frac{l-x}{l}\right) = S\frac{x}{l}. \tag{5}$$

The general term is

$$-S\,\frac{\sinh in\pi(l-x)/l}{(-h^2n^2\pi^2/l^2)(\cosh in\pi)(l^2/2h^2in\pi)}\,e^{-n^2\pi^2h^2t/l^2} = S\,\frac{2}{n\pi}\sin\frac{n\pi x}{l}\,e^{-n^2\pi^2h^2t/l^2}, \tag{6}$$

and the complete solution is

$$V = S\left[\frac{x}{l} + \sum_{n=1}^{\infty}\frac{2}{n\pi}\sin\frac{n\pi x}{l}\,e^{-n^2\pi^2h^2t/l^2}\right]. \tag{7}$$


If we use the Bromwich integral we have

$$V = \frac{S}{2\pi i}\int_L e^{zt}\left(1 - \frac{\sinh\zeta(l-x)}{\sinh\zeta l}\right)\frac{dz}{z}, \tag{8}$$

where $h^2\zeta^2 = z$. The integrand is a single-valued function of $z$ with poles at 0 and $-n^2h^2\pi^2/l^2$.
It is immaterial whether we define $\zeta$ to be real and positive or real and negative when $z$
is real and positive. We take it positive. The path can be deformed as shown; and then into a loop
about $z = 0$ and small circles about all the negative poles. The residue at 0 is $x/l$. For the negative
poles, $\zeta = +in\pi/l$, and

$$z\frac{d}{dz}\sinh\zeta l = \tfrac12\zeta\frac{d}{d\zeta}\sinh\zeta l = \tfrac12\zeta l\cosh\zeta l = \tfrac12 in\pi(-1)^n.$$

Hence the residues are

$$-\frac{\sinh in\pi(l-x)/l}{\tfrac12 in\pi(-1)^n}\,e^{-n^2\pi^2h^2t/l^2} = \frac{2}{n\pi}\sin\frac{n\pi x}{l}\,e^{-n^2\pi^2h^2t/l^2}.$$

The values $\zeta = -in\pi/l$ do not arise, since at the negative poles $\arg\zeta = \tfrac12\arg z = \tfrac12\pi$
when the path is taken as shown. Hence we recover (7). The path might also be taken in
either of the following two ways:

It is easy to show that either of these leads to the same result.
The series (7) is rapidly convergent if $\pi ht^{1/2}/l$ is moderate; if it is 1 the second exponential
in the sum is $e^{-4} = 0.018$, and the next $e^{-9} = 0.0001$. At this stage there is clearly considerable
cooling for half the length of the rod.

If $\pi ht^{1/2}/l$ is small the convergence is slow. In this case we can adopt a form of the
expansion method applied to waves.* We write (4) in the form

$$V = S[1 - e^{-qx}(1 - e^{-2q(l-x)})(1 + e^{-2ql} + e^{-4ql} + \dots)]. \tag{9}$$

For if we interpret this as an integral along the path $L$, the argument of $\zeta$ is between $\pm\tfrac14\pi$
at all points of the path, and the series converges uniformly and absolutely. Integration
term by term is therefore justifiable, and we may interpret term by term. Now

$$qx = xp^{1/2}/h, \tag{10}$$

and by 12.126 (19)

$$e^{-qx} = 1 - \operatorname{erf}\frac{x}{2ht^{1/2}} \quad (t > 0). \tag{11}$$

Hence

$$V = S\left[\operatorname{erf}\frac{x}{2ht^{1/2}} + \left\{1 - \operatorname{erf}\frac{2l-x}{2ht^{1/2}}\right\} - \left\{1 - \operatorname{erf}\frac{2l+x}{2ht^{1/2}}\right\} + \dots\right]. \tag{12}$$

When $w$ is great, $1 - \operatorname{erf} w$ is small compared with $e^{-w^2}$. If then $x/2ht^{1/2}$ is moderate, but
$l/2ht^{1/2}$ large, this series is rapidly convergent, and can in most cases be reduced to its
first term. This solution is therefore convenient whenever (7) is not.

The separate terms of (9) do appear to depend on which sign we take for $\zeta$ when $z$ is real
and positive, whereas the original integral does not. But if we took the negative sign the
series would diverge on $L$. There would, however, be a convergent expansion on $L$ with
positive signs in the exponents. But since $\Re(\zeta)$ is now negative on $L$ we are led again to
precisely the same series. The choice of the positive sign for $\Re(\zeta)$ is convenient, but the
negative sign would lead to the same answers if the work is done correctly.

The temperature gradient at $x = 0$ is, for large $l$,

$$S\left[\frac{\partial}{\partial x}(1 - e^{-qx})\right]_{x=0} = Sq = \frac{S}{h\sqrt{(\pi t)}}. \tag{13}$$

This equation played an important part in Kelvin's estimate of the age of the Earth.
Neglecting the curvature of the Earth, he treated the cooling problem as one of
one-dimensional flow, $S$ being the melting point; then measures of the temperature gradient
at the surface led by (13) to an estimate of $t$. Knowledge not available in his time has led
to considerable change in the result.
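
The two forms of the solution for the cooled rod, the Fourier series (7) and the error-function expansion obtained from (9) and (11), can be compared numerically. The sketch below is a minimal illustration. The names `v_fourier` and `v_images`, the defaults $S = l = h = 1$ and the numbers of terms retained are assumptions of ours.

```python
import math

def v_fourier(x, t, S=1.0, l=1.0, h=1.0, terms=200):
    """Series (7): steady state x/l plus exponentially decaying sine terms."""
    total = x / l
    for n in range(1, terms + 1):
        total += (2.0 / (n * math.pi)) * math.sin(n * math.pi * x / l) \
                 * math.exp(-(n * math.pi * h / l) ** 2 * t)
    return S * total

def v_images(x, t, S=1.0, l=1.0, h=1.0, terms=50):
    """Expansion (9) interpreted term by term, with e^{-qx} -> 1 - erf(x / 2 h sqrt(t))."""
    w = 2.0 * h * math.sqrt(t)
    total = 1.0
    for j in range(terms):
        total -= math.erfc((2 * j * l + x) / w)
        total += math.erfc((2 * (j + 1) * l - x) / w)
    return S * total
```

For small $t$ only the first error-function term matters, while the sine series needs many terms. For large $t$ the situation is reversed, as the text remarks.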


20.03. One-dimensional flow of heat in a region infinite in both directions.

First suppose that at time 0 the distribution of temperature is given by

$$V = H(x). \tag{1}$$

We have seen that the function

$$e^{-qx} = 1 - \operatorname{erf}\frac{x}{2ht^{1/2}} \quad (x > 0) \tag{2}$$

satisfies the differential equation

$$\frac{\partial V}{\partial t} - h^2\frac{\partial^2 V}{\partial x^2} = 0 \tag{3}$$

for positive values of $x$; and it will also satisfy it for negative $x$, since both terms in (3)
are odd functions of $x$. Also when $t$ tends to zero this function tends to zero for all positive
values of $x$, and to 2 for all negative values. It follows that the function

$$\tfrac12(2 - e^{-qx}) = \tfrac12\left[1 + \operatorname{erf}\frac{x}{2ht^{1/2}}\right] \tag{4}$$

satisfies the differential equation for all values of $x$ and all positive $t$; and when $t$ tends to
zero it tends to 0 for negative $x$ and to 1 for positive $x$; and therefore to $H(x)$. Hence this
function gives the solution for positive $t$ if $V = H(x)$ when $t = 0$.

Suppose now that the initial distribution of temperature is

$$V = f(x) = \int_{\xi=-\infty}^{\infty} f(\xi)\,dH(\xi - x). \tag{5}$$

The solution that reduces to $H(\xi - x)$ when $t = 0$ is

$$\tfrac12\left[1 + \operatorname{erf}\frac{\xi - x}{2ht^{1/2}}\right]. \tag{6}$$

Hence, by the principle of superposition, the solution of the more general problem is

$$V = \int_{\xi=-\infty}^{\infty} f(\xi)\,d\,\tfrac12\left[1 + \operatorname{erf}\frac{\xi - x}{2ht^{1/2}}\right] = \frac{1}{2h\sqrt{(\pi t)}}\int_{-\infty}^{\infty} f(\xi)\,e^{-(\xi-x)^2/4h^2t}\,d\xi. \tag{7}$$

Put now

$$\xi = x + 2ht^{1/2}\lambda; \tag{8}$$

then

$$V = \frac{1}{\sqrt\pi}\int_{-\infty}^{\infty} f(x + 2ht^{1/2}\lambda)\,e^{-\lambda^2}\,d\lambda. \tag{9}$$

This is the general solution obtained by Fourier.
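
Fourier's integral lends itself to direct numerical evaluation. As a sketch, the integral over $\lambda$ may be truncated to a finite range and estimated by the trapezoidal rule. The function name `fourier_solution`, the half-width of 8 and the number of steps are illustrative assumptions of ours, not part of the method itself.

```python
import math

def fourier_solution(f, x, t, h=1.0, half_width=8.0, steps=4000):
    """Trapezoidal estimate of (1/sqrt(pi)) * integral of f(x + 2 h sqrt(t) lam) exp(-lam^2)."""
    d = 2.0 * half_width / steps
    scale = 2.0 * h * math.sqrt(t)
    total = 0.0
    for i in range(steps + 1):
        lam = -half_width + i * d
        weight = 0.5 if i in (0, steps) else 1.0  # end points carry half weight
        total += weight * f(x + scale * lam) * math.exp(-lam * lam)
    return total * d / math.sqrt(math.pi)
```

Two simple checks suggest themselves. An initial distribution $\cos bx$ should decay as $\cos bx\,e^{-b^2h^2t}$, and the unit step $H(x)$ should reproduce $\tfrac12[1 + \operatorname{erf}(x/2ht^{1/2})]$, the latter only to the accuracy the step size allows at the discontinuity.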


20.04. Imperfect cooling at the free end of a one-dimensional region. With
the initial conditions of 20.02, let us suppose that the end $x = l$ is maintained at temperature
$S$ as before, but that the end $x = 0$ is not effectively cooled to temperature 0.
Instead we suppose that it radiates away heat at a rate proportional to its temperature.
At the same time heat is conducted to the end at a rate $k\,\partial V/\partial x$ per unit area. These effects
must balance if the temperature at the surface is to vary continuously, so that instead
of having $V = 0$ at the end as before we shall have a relation of the form

$$\frac{\partial V}{\partial x} - \alpha V = 0 \quad (x = 0). \tag{1}$$

The operational solution is again

$$V = S\{1 - A\sinh q(l - x)\}H(t), \tag{2}$$

where $A$ has now to be determined to satisfy (1); then

$$qA\cosh ql - \alpha(1 - A\sinh ql) = 0, \tag{3}$$

and

$$V = S\left\{1 - \frac{\alpha\sinh q(l-x)}{q\cosh ql + \alpha\sinh ql}\right\}. \tag{4}$$
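The roots in $p$ are real and negative, and we can proceed to an interpretation in partial
fractions as usual. Or, using the expansion in 'waves', we have

$$V = S\left[1 - \frac{\alpha}{q+\alpha}\,e^{-qx}(1 - e^{-2q(l-x)})\left(1 - \frac{q-\alpha}{q+\alpha}\,e^{-2ql} + \dots\right)\right]. \tag{5}$$

If the length is great enough to make the terms involving $e^{-2ql}$ inappreciable, we can reduce
this to its first two terms, thus

$$V = S\left[1 - \frac{\alpha}{q+\alpha}\,e^{-qx}\right]. \tag{6}$$

If $\alpha$ is great, the solution reduces to that of 20.02: this is to be expected, for (1) then
implies that $V = 0$ when $x = 0$, which is the boundary condition adopted in 20.02. If $\alpha$
is small, $V$ reduces to $S$; the reason is that this implies that there is no loss of heat from the
end, and therefore the temperature does not change anywhere. For intermediate values
of $\alpha$ we proceed as follows. If

$$y = \frac{\alpha e^{-qx}}{q+\alpha}\,H(t), \tag{7}$$

Bromwich's rule gives

$$y = \frac{1}{2\pi i}\int_L\frac{\alpha h}{z^{1/2} + \alpha h}\exp\left(zt - \frac{z^{1/2}x}{h}\right)\frac{dz}{z}. \tag{8}$$

Put $z = \zeta^2$. The path for $\zeta$ is a curve from $Re^{-\frac14\pi i}$ to $Re^{\frac14\pi i}$, where $R$ is great, passing the
origin on the positive side. Denote this path by $N$. Then

$$y = \frac{1}{\pi i}\int_N\left(\frac{1}{\zeta} - \frac{1}{\zeta + \alpha h}\right)\exp\left(\zeta^2 t - \frac{\zeta x}{h}\right)d\zeta. \tag{9}$$

But

$$\frac{1}{\pi i}\int_N\exp\left(\zeta^2 t - \frac{\zeta x}{h}\right)\frac{d\zeta}{\zeta} = e^{-qx} = 1 - \operatorname{erf}\frac{x}{2ht^{1/2}} \quad (t > 0). \tag{10}$$

For the second integral in (9) we put $\zeta + \alpha h = \mu$; and this part of $y$ is

$$-\frac{1}{\pi i}\int\frac{d\mu}{\mu}\exp\left\{\mu^2 t - \mu\left(2\alpha ht + \frac{x}{h}\right)\right\}\exp(\alpha^2h^2t + \alpha x)$$

$$= -\exp(\alpha^2h^2t + \alpha x)\left(1 - \operatorname{erf}\frac{x + 2\alpha h^2t}{2ht^{1/2}}\right). \tag{11}$$

Hence

$$y = 1 - \operatorname{erf}\xi - \exp(\eta^2 + \alpha x)\{1 - \operatorname{erf}(\xi + \eta)\}, \tag{12}$$

where $\xi = x/2ht^{1/2}$, $\eta = \alpha ht^{1/2}$; and

$$V = S[\operatorname{erf}\xi + \exp(\eta^2 + \alpha x)\{1 - \operatorname{erf}(\xi + \eta)\}]. \tag{13}$$

This is the same as Riemann's solution.*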
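
The closed form (13) can be tested against the boundary condition (1) and the equation of conduction by finite differences. The sketch below makes several assumptions: the function name `v_radiating_end` and the defaults $S = \alpha = h = 1$ are ours, the rod is taken long enough for (6) to apply, and the expression is used in a small neighbourhood of $x = 0$ purely so that a central difference can be formed there.

```python
import math

def v_radiating_end(x, t, S=1.0, alpha=1.0, h=1.0):
    """Solution (13): V = S[erf(xi) + exp(eta^2 + alpha x) (1 - erf(xi + eta))]."""
    xi = x / (2.0 * h * math.sqrt(t))
    eta = alpha * h * math.sqrt(t)
    return S * (math.erf(xi) + math.exp(eta * eta + alpha * x) * math.erfc(xi + eta))

def boundary_residual(t, alpha=1.0, h=1.0, dx=1e-5):
    """dV/dx - alpha V at x = 0, estimated by a central difference."""
    slope = (v_radiating_end(dx, t, alpha=alpha, h=h)
             - v_radiating_end(-dx, t, alpha=alpha, h=h)) / (2.0 * dx)
    return slope - alpha * v_radiating_end(0.0, t, alpha=alpha, h=h)

def heat_residual(x, t, alpha=1.0, h=1.0, dx=1e-3, dt=1e-4):
    """dV/dt - h^2 d2V/dx2, estimated by central differences."""
    v = lambda a, b: v_radiating_end(a, b, alpha=alpha, h=h)
    v_t = (v(x, t + dt) - v(x, t - dt)) / (2.0 * dt)
    v_xx = (v(x + dx, t) - 2.0 * v(x, t) + v(x - dx, t)) / (dx * dx)
    return v_t - h * h * v_xx
```

Both residuals should be small compared with $V$ itself. For large $\alpha ht^{1/2}$ the factor $\exp\eta^2$ overflows in floating point, and the asymptotic expansion of $\operatorname{erf}$ is then the better route.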

* Riemann-Weber, Partielle Differentialgleichungen, 2, 1912, 96-8.

The temperature at $x = 0$ is $S\exp\eta^2\,(1 - \operatorname{erf}\eta)$, whence the gradient at the end follows
by (1). For $t$ small this has a convergent expansion in negative powers of $t$ and therefore
falls continuously. The temperature fall at $x = 0$ is not instantaneous, nor the gradient
momentarily infinite, as in 20.02. For great values of $t$ we can use the asymptotic expansion
for $\operatorname{erf}\eta$:

$$V_{x=0} \sim \frac{S}{\alpha h\sqrt{(\pi t)}}\left\{1 - \frac{1}{2\alpha^2h^2t} + \frac{1\cdot 3}{2\cdot 2}\left(\frac{1}{\alpha^2h^2t}\right)^2 - \dots\right\}. \tag{14}$$

This is equivalent to one found by Heaviside.*


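20.05. A long rod is fastened at the end $x = 0$, the other end $x = l$ being free. Initially
it is at temperature 0, but at time 0 the clamped end is raised to temperature $S$ and kept
there. Each part of the rod loses heat by radiation and convection at a rate proportional
to its temperature.†

The differential equation is now

$$\frac{\partial V}{\partial t} = h^2\frac{\partial^2 V}{\partial x^2} - \alpha^2 V, \tag{1}$$

where $\alpha$ is a constant. Put

$$p + \alpha^2 = h^2r^2, \tag{2}$$

and write the equation

$$\frac{\partial^2 V}{\partial x^2} = r^2 V. \tag{3}$$

The solution for a long rod is

$$V = Se^{-rx} = \frac{S}{2\pi i}\int_L\exp\left\{zt - (z + \alpha^2)^{1/2}\frac{x}{h}\right\}\frac{dz}{z}. \tag{4}$$

Put $z + \alpha^2 = \zeta^2$; then

$$V = \frac{S}{2\pi i}\int\exp\left\{(\zeta^2 - \alpha^2)t - \frac{\zeta x}{h}\right\}\left(\frac{1}{\zeta - \alpha} + \frac{1}{\zeta + \alpha}\right)d\zeta. \tag{5}$$

But if

$$\zeta = \alpha + \mu, \tag{6}$$

the term in $1/(\zeta - \alpha)$ becomes

$$\frac{S}{2\pi i}\int\frac{d\mu}{\mu}\exp\left\{\mu^2 t - \mu\left(\frac{x}{h} - 2\alpha t\right)\right\}\exp\left(-\frac{\alpha x}{h}\right) = \tfrac12 S\exp\left(-\frac{\alpha x}{h}\right)\left(1 - \operatorname{erf}\frac{x - 2\alpha ht}{2ht^{1/2}}\right), \tag{7}$$

as in 20.04 (11). The complete solution is

$$V = \tfrac12 S\left[\exp\left(-\frac{\alpha x}{h}\right)\left(1 - \operatorname{erf}\frac{x - 2\alpha ht}{2ht^{1/2}}\right) + \exp\left(\frac{\alpha x}{h}\right)\left(1 - \operatorname{erf}\frac{x + 2\alpha ht}{2ht^{1/2}}\right)\right]. \tag{8}$$

If $\alpha t^{1/2}$ is small the error functions are practically unity so long as $x/2ht^{1/2}$ is large, and $V$
is very small. $V$ can be comparable with $S$ for small values of the time only if $x/2ht^{1/2}$ is not
large, and then $\alpha x/h$ is small. Then the solution is practically $S(1 - \operatorname{erf} x/2ht^{1/2})$, which is
the solution in the absence of radiation from the sides.

If $\alpha t^{1/2}$ is large and $x/2ht^{1/2}$ not large, the first error function is nearly $-1$, and the second
$+1$, and $V = Se^{-\alpha x/h}$. This is the solution for a steady state, and will hold so long as
$\alpha t^{1/2} - x/2ht^{1/2}$ is large and positive, even if $x/2ht^{1/2}$ is itself large. The steady state is therefore
reached approximately, for given $x$, when both $t > \alpha^{-2}$, and $t > x/2\alpha h$.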

If there are several rods of different conductivities but similar surfaces, so that $\alpha$ is
the same for all, the values of $x$ where $V$ attains a given steady value will be proportional
to $h$.

* Electromagnetic Theory, 2, 15.

† Ingen-Hausz's experiment: cf. Edser, Heat for Advanced Students, 1908, p. 424.

20·06. The cooling of the Earth. Cooling in the Earth since it first became solid has not had time to become appreciable except at depths small compared with the radius. It is therefore legitimate to neglect the effects of curvature and treat the problem as one-dimensional. Radiation from the outer surface must have soon reduced the temperature to that maintained by solar radiation, so that we may suppose the surface temperature to be constant and adopt it as our zero of temperature. The chief difference from the problem of 20·02 is that we must allow for the heating effect of radioactivity in the outer layers. Suppose first that the quantity $P$ defined in 20·01 is equal to a constant $A$ down to a depth $H$ and zero below that depth. Take the initial temperature to be $S + mx$ $(x>0)$, where $m$ is a constant. Then the subsidiary equation is

$$\frac{\partial^2 V}{\partial x^2} - q^2 V = -\frac{A}{h^2} - q^2(S+mx) \quad (0<x<H) \tag{1}$$

$$\phantom{\frac{\partial^2 V}{\partial x^2} - q^2 V} = -q^2(S+mx) \quad (H<x), \tag{2}$$

and the solutions are

$$V = \frac{A}{p} + S + mx + Be^{-qx} + Ce^{qx} \quad (0<x<H) \tag{3}$$

$$\phantom{V} = S + mx + De^{-qx} \quad (H<x). \tag{4}$$

A term in $e^{qx}$ is not required in the solution for great depths, because it would imply that the temperature dropped suddenly by a finite amount in consequence of a disturbance at the surface. $V$ must vanish at $x = 0$, and $V$ and $\partial V/\partial x$ are continuous at $x = H$. Hence

$$B + C + S + A/p = 0, \qquad Be^{-qH} + Ce^{qH} + A/p = De^{-qH}, \qquad Be^{-qH} - Ce^{qH} = De^{-qH}. \tag{5}$$

Solving and substituting in (3), (4) we find

$$V = S(1-e^{-qx}) + mx + \frac{A}{p}\left(1 - e^{-qx} - e^{-qH}\sinh qx\right) \quad (0<x<H), \tag{6}$$

$$V = S(1-e^{-qx}) + mx + \frac{A}{p}(\cosh qH - 1)\,e^{-qx} \quad (H<x). \tag{7}$$


The solutions and their derivatives with regard to $x$ involve operators of the forms $q^{-1}e^{-qx}$, $q^{-2}e^{-qx}$. These can be evaluated by using Bromwich’s integral and integrating by parts; or we can start with

$$e^{-qx} = 1 - \operatorname{erf}\frac{x}{2ht^{1/2}}, \tag{8}$$

and if

$$\Phi_1(u) = \int_u^\infty (1-\operatorname{erf} v)\,dv, \qquad \Phi_2(u) = \int_u^\infty \Phi_1(v)\,dv, \tag{9}$$

$$q^{-1}e^{-qx} = 2ht^{1/2}\,\Phi_1\!\left(\frac{x}{2ht^{1/2}}\right), \qquad q^{-2}e^{-qx} = 4h^2t\,\Phi_2\!\left(\frac{x}{2ht^{1/2}}\right). \tag{10}$$

The explicit forms of $\Phi_1$ and $\Phi_2$ are

$$\Phi_1(u) = \frac{1}{\sqrt\pi}e^{-u^2} - u(1-\operatorname{erf}u), \tag{11}$$

$$\Phi_2(u) = \left(\tfrac14 + \tfrac12u^2\right)(1-\operatorname{erf}u) - \frac{1}{2\sqrt\pi}\,u e^{-u^2}. \tag{12}$$

The functions have been partly tabulated by Jeffreys and Hartree (cf. 23·08). They are, however, intimately related to the functions $\mathrm{Hh}_1$ and $\mathrm{Hh}_2$ treated later and tabulated in the British Association Tables; it would be useful to have the latter tabulated to four figures at a short enough interval to permit linear interpolation.
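The closed forms for the repeated integrals of $1-\operatorname{erf}u$ are easy to confirm by quadrature. The following is a minimal sketch, assuming the definitions $\Phi_1(u)=\int_u^\infty(1-\operatorname{erf}v)\,dv$ and $\Phi_2(u)=\int_u^\infty\Phi_1(v)\,dv$; the names `phi1`, `phi2`, and the Simpson helper with its cut-off at 8 are illustrative choices.

```python
import math

def phi1(u):
    """Closed form: exp(-u^2)/sqrt(pi) - u (1 - erf u)."""
    return math.exp(-u * u) / math.sqrt(math.pi) - u * math.erfc(u)

def phi2(u):
    """Closed form: (1/4 + u^2/2)(1 - erf u) - u exp(-u^2) / (2 sqrt(pi))."""
    return (0.25 + 0.5 * u * u) * math.erfc(u) - u * math.exp(-u * u) / (2.0 * math.sqrt(math.pi))

def integrate_to_infinity(f, u, upper=8.0, steps=4000):
    """Composite Simpson rule on [u, upper]; the tail beyond `upper` is negligible here."""
    width = (upper - u) / steps
    total = f(u) + f(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(u + k * width)
    return total * width / 3.0

if __name__ == "__main__":
    u = 0.7
    print(phi1(u), integrate_to_infinity(math.erfc, u))
    print(phi2(u), integrate_to_infinity(phi1, u))
```

With these functions the operators $q^{-1}e^{-qx}$ and $q^{-2}e^{-qx}$ can be evaluated as $2ht^{1/2}\Phi_1(x/2ht^{1/2})$ and $4h^2t\,\Phi_2(x/2ht^{1/2})$.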

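In the actual problem a considerable simplification arises from the fact that $H$ is small compared with $2ht^{1/2}$. On this account we can expand the solutions in powers of $H$ and retain only the earlier terms (the path for the Bromwich integral being modified to $M$). Then for the surface temperature gradient we have

$$\left(\frac{\partial V}{\partial x}\right)_{x=0} = Sq + m + \frac{A}{p}\left(q - qe^{-qH}\right) = Sq + m + \frac{A}{h^2q}\left(qH - \tfrac12q^2H^2 + \dots\right)$$

$$= Sq + m + \frac{AH}{h^2}\left(1 - \tfrac12qH\right) + \dots$$

$$= \left(S - \tfrac12\frac{AH^2}{h^2}\right)\frac{1}{h\sqrt{(\pi t)}} + m + \frac{AH}{h^2} + \dots, \tag{13}$$

and for the temperature at depths greater than $H$

$$V = S(1-e^{-qx}) + mx + \frac{AH^2}{2h^2}e^{-qx} + \dots$$

$$\doteq mx + \left(S - \tfrac12\frac{AH^2}{h^2}\right)\operatorname{erf}\frac{x}{2ht^{1/2}} + \frac{AH^2}{2h^2}. \tag{14}$$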

The age of the earth is now known to be of the order of $2 \times 10^9$ years; with this value the
term AHjh 2 in (13) accounts for about f of the observed temperature gradient. 

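An alternative possibility is that the radioactive generation of heat, instead of being confined to a uniform surface layer, may decrease exponentially with depth. The subsidiary equation becomes

$$\frac{\partial^2V}{\partial x^2} - q^2V = -\frac{A}{h^2}e^{-\alpha x} - q^2(S+mx) \tag{15}$$

at all depths. We already know the part of the solution contributed by $S + mx$. The remainder is

$$V_1 = \frac{A}{h^2}\,\frac{e^{-\alpha x} - e^{-qx}}{q^2-\alpha^2}. \tag{16}$$

But

$$\frac{1}{h^2(q^2-\alpha^2)} = \frac{1}{p-h^2\alpha^2} = \frac{1}{h^2\alpha^2}\{\exp(h^2\alpha^2t) - 1\}, \tag{17}$$

and

$$-\frac{\alpha^2}{q^2-\alpha^2}\,e^{-qx} = \frac12\left(\frac{\alpha}{q+\alpha} - \frac{\alpha}{q-\alpha}\right)e^{-qx} \tag{18}$$

$$= \tfrac12\left[e^{-qx} - \exp(\gamma^2+\alpha x)\{1-\operatorname{erf}(\xi+\gamma)\}\right] + \tfrac12\left[e^{-qx} - \exp(\gamma^2-\alpha x)\{1-\operatorname{erf}(\xi-\gamma)\}\right], \tag{19}$$

where

$$\xi = x/2ht^{1/2}, \qquad \gamma = \alpha ht^{1/2}. \tag{20}$$

Hence, collecting the terms,

$$V_1 = \frac{A}{h^2\alpha^2}\left[1 - \operatorname{erf}\xi - e^{-\alpha x} - \tfrac12\exp(\gamma^2+\alpha x)\{1-\operatorname{erf}(\xi+\gamma)\} + \tfrac12\exp(\gamma^2-\alpha x)\{1+\operatorname{erf}(\xi-\gamma)\}\right], \tag{21}$$

which is the same as the solution obtained by Ingersoll and Zobel.*

The contribution of radioactivity to the temperature gradient at the surface is

$$\left(\frac{\partial V_1}{\partial x}\right)_{x=0} = \frac{A}{h^2\alpha}\{1 - \exp\gamma^2\,(1-\operatorname{erf}\gamma)\}. \tag{22}$$

When $\gamma$ is great, as it actually is, $\exp\gamma^2\,(1-\operatorname{erf}\gamma)$ is nearly $1/\gamma\sqrt\pi$, and (22) is nearly $A/h^2\alpha$.

To reconcile the various data it is necessary that the radioactivity must be practically confined to the outermost 20 km.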
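The size of the correction in the surface gradient can be seen from a short calculation. The sketch below assumes the gradient has the form $(A/h^2\alpha)\{1-\exp\gamma^2\,(1-\operatorname{erf}\gamma)\}$ with $\gamma=\alpha ht^{1/2}$; the function name and unit parameter values are illustrative only, and the direct product overflows for $\gamma$ above about 26, so it is meant for moderate $\gamma$.

```python
import math

def radioactive_gradient(t, A=1.0, h=1.0, alpha=1.0):
    """(A / (h^2 alpha)) * {1 - exp(gamma^2) (1 - erf gamma)}, gamma = alpha h sqrt(t)."""
    gamma = alpha * h * math.sqrt(t)
    return A / (h * h * alpha) * (1.0 - math.exp(gamma * gamma) * math.erfc(gamma))

def large_gamma_estimate(t, A=1.0, h=1.0, alpha=1.0):
    """Leading correction for large gamma: (A / (h^2 alpha)) * (1 - 1/(gamma sqrt(pi)))."""
    gamma = alpha * h * math.sqrt(t)
    return A / (h * h * alpha) * (1.0 - 1.0 / (gamma * math.sqrt(math.pi)))

if __name__ == "__main__":
    for t in (1.0, 25.0, 400.0):
        print(t, radioactive_gradient(t), large_gamma_estimate(t))
```

The two columns approach each other, and both approach $A/h^2\alpha$, as $\gamma$ grows.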

20·07. A spherical thermometer bulb is initially at a uniform temperature equal to that of its surroundings. The temperature of the air decreases with height, and the thermometer is carried upwards in a balloon at such a rate that the temperature at the outside of the glass varies linearly with the time. Find how the mean temperature of the mercury varies.†

The temperature within the bulb satisfies the equation

$$\frac{\partial V}{\partial t} = h^2\nabla^2V = \frac{h^2}{r}\frac{\partial^2}{\partial r^2}(rV), \tag{1}$$

and the subsidiary equation is

$$\frac{\partial^2}{\partial r^2}(rV) = q^2\,rV. \tag{2}$$

The solution finite at the centre is $V = (A/r)\sinh qr$, where $A$ is a function of $t$. The temperature at the outer surface of the glass is $Gt$, where $G$ is a constant. But the glass has only a finite conductivity, and the surface condition at the outside of the mercury is nearly

$$\frac{\partial V}{\partial r} = K(Gt - V), \tag{3}$$

$K$ depending on the ratio of the two conductivities and the thickness of the glass. Then

$$A = \frac{Ka^2G/p}{Ka\sinh qa + qa\cosh qa - \sinh qa}, \tag{4}$$

where $a$ is the inner radius of the glass.

The mean temperature of the mercury is

$$V_0 = \frac{3}{a^3}\int_0^a r^2V\,dr = \frac{3A}{a^3}\int_0^a r\sinh qr\,dr = \frac{3KG}{apq^2}\,\frac{qa\cosh qa - \sinh qa}{Ka\sinh qa + qa\cosh qa - \sinh qa}. \tag{5}$$


* Mathematical Theory of Heat Conduction, Ginn, 1913.

† A. R. McLeod, Phil. Mag. (6) 37, 1919, 134; Bromwich, ibid. (6) 37, 1919, 407-19.

In applying the partial fraction rule, we notice that near $p = 0$

$$V_0 \doteq \frac{3KG}{apq^2}\,\frac{\tfrac13(qa)^3\left(1+\tfrac1{10}q^2a^2\right)}{qa\left(Ka + \tfrac16Kq^2a^3 + \tfrac13q^2a^2\right)}$$

$$= G\left\{\frac1p - \frac1{15}\frac{a^2}{h^2} - \frac{a}{3h^2K}\right\}$$

$$= G\left\{t - \frac{a^2}{h^2}\left(\frac1{15} + \frac1{3aK}\right)\right\}. \tag{6}$$

Thus there is a systematic lag in the temperature of the mercury in comparison with that of the air, which must be allowed for in the measurement of upper-air temperatures.

The other zeros of the denominator give exponential contributions, which are evaluated in Bromwich’s paper.
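The expansion near $p = 0$ can be checked by evaluating the operational expression for the mean temperature at a small real $p$. This is only a sketch: it assumes $V_0 = (3KG/apq^2)(qa\cosh qa-\sinh qa)/(Ka\sinh qa+qa\cosh qa-\sinh qa)$ with $p=h^2q^2$, and the function names and the sample values $a=h=1$, $K=2$, $G=1$ are hypothetical.

```python
import math

def mean_temperature_operator(p, a=1.0, h=1.0, K=2.0, G=1.0):
    """Operational expression for the mean temperature, evaluated at real p > 0."""
    q = math.sqrt(p) / h
    qa = q * a
    top = qa * math.cosh(qa) - math.sinh(qa)
    bottom = K * a * math.sinh(qa) + top
    return 3.0 * K * G / (a * p * q * q) * top / bottom

def lag(a=1.0, h=1.0, K=2.0):
    """Time lag (a^2 / h^2) (1/15 + 1/(3 a K)) of the mercury behind the air."""
    return (a * a / (h * h)) * (1.0 / 15.0 + 1.0 / (3.0 * a * K))

def small_p_expansion(p, a=1.0, h=1.0, K=2.0, G=1.0):
    """Leading terms near p = 0: G (1/p - lag)."""
    return G * (1.0 / p - lag(a, h, K))

if __name__ == "__main__":
    for p in (1e-1, 1e-2, 1e-3):
        print(p, mean_temperature_operator(p) - small_p_expansion(p))
```

The printed differences shrink in proportion to $p$, as expected if the neglected terms begin at order $p$.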


20·08. Periodic supply of heat. Reference was made in 18·012 to the possibility of periodic solutions of the equation of heat conduction when there is a periodic source of heat. As an example, we consider a one-dimensional region where the temperature at $x = 0$ is given to be $S\cos\gamma t$. Regard this as the real part of $S\exp i\gamma t$. Then

$$\frac{\partial^2V}{\partial x^2} = \frac1{h^2}\frac{\partial V}{\partial t} = \frac{i\gamma}{h^2}V,$$

and the solution tending to zero for large $x$ is

$$V = Se^{-\lambda x + i\gamma t},$$

where

$$\lambda^2 = i\gamma/h^2, \qquad \lambda = \sqrt{\left(\frac{\gamma}{2h^2}\right)}(1+i) = (1+i)\kappa.$$

Then

$$V = S\exp\{i\gamma t - (1+i)\kappa x\},$$

and taking the real part

$$V = Se^{-\kappa x}\cos(\gamma t - \kappa x).$$

The variation is periodic but its amplitude falls off exponentially with depth, and the phase is continually retarded. At a depth $\pi/\kappa$ the phase is opposite to what it is at the end and the amplitude is reduced in the ratio 1 to $e^{-\pi}$. The changes are more rapid with depth the shorter the period.

This is observed to occur for the diurnal and annual variations of temperature in the 
ground, the former being inappreciable at depths more than about 1 and the latter about 
18 metres. It is important in meteorology, because the ocean is turbulent and heat is 
transferred to much greater depths by mixing. This, even more than the difference of 
specific heats, accounts for the greater ability of the ocean than the land to store the heat 
it receives during the summer, and to warm the air passing over it in the winter. 
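The dependence on period can be illustrated with a few lines of code. The sketch assumes the solution $V = Se^{-\kappa x}\cos(\gamma t-\kappa x)$ with $\kappa=\sqrt{(\gamma/2h^2)}$; the value $h^2 = 10^{-6}\ \mathrm{m^2/s}$ is a hypothetical round figure for soil and is not taken from the text.

```python
import math

DAY = 86400.0            # seconds
YEAR = 365.25 * DAY      # seconds
H_SQUARED = 1.0e-6       # m^2/s, assumed thermometric conductivity (illustrative)

def kappa(period, h_squared=H_SQUARED):
    """Decay constant sqrt(gamma / (2 h^2)) for a surface variation of the given period."""
    gamma = 2.0 * math.pi / period
    return math.sqrt(gamma / (2.0 * h_squared))

def temperature(x, t, period, S=1.0, h_squared=H_SQUARED):
    """S exp(-kappa x) cos(gamma t - kappa x)."""
    gamma = 2.0 * math.pi / period
    k = kappa(period, h_squared)
    return S * math.exp(-k * x) * math.cos(gamma * t - k * x)

if __name__ == "__main__":
    for name, period in (("diurnal", DAY), ("annual", YEAR)):
        depth = math.pi / kappa(period)
        print(name, "phase reversed at depth (m):", depth)
```

Whatever value of $h^2$ is used, the two depths are in the ratio $\sqrt{365.25}$, about 19, which is of the same order as the ratio of the depths quoted for the diurnal and annual variations.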


EXAMPLES 

1. A uniform conducting sphere of radius $a$ and thermometric conductivity $h^2$ is initially at temperature 0. Heat is supplied uniformly throughout the sphere in such a way that the temperature would rise at a rate $P$ if there was no conduction. The outside is maintained at temperature 0. Show that at any subsequent time the temperature at any point is

$$V = \frac16\frac{P}{h^2}(a^2-r^2) + \frac{2P}{h^2r}\sum_{n=1}^{\infty}\left(\frac{a}{n\pi}\right)^3(-1)^n\sin\frac{n\pi r}{a}\,\exp(-h^2n^2\pi^2t/a^2).$$

If $t$ is small show that an approximation to $-\partial V/\partial r$ at the outside is

(M.T. Sched. B, 1926.)


2. A uniform sphere, originally at temperature S, cools from the surface. The temperature gradient 
at the surface is —k times the temperature there. Prove that after a long time the temperature at 
the surface is nearly 

(M.T. Sched. B, 1928.) 



3. A long uniform string is stretched so as to propagate waves with velocity c. There is a resistance 
to transverse movement capable of producing a retardation equal to A times the velocity. One end is 
suddenly drawn aside a distance y 0 . Prove that (i) there is no motion at distance x from that end 
until time x/c, (ii) the slope at the disturbed end is asymptotically 


c \ntJ \ 4A t 32A H 2 ) 


(M.T. Sched. B, 1928.) 


4. A sphere of radius $a$ with initial temperature $V_0$ is surrounded by an infinite medium of the same material, and with initial temperature 0. Prove that the temperature at distance $r$ $(>a)$ from the centre at time $t$ is


V 77 U C r 


where 


r—a r+a 

2h*Jt* C 2h*Jt 


(I.C. 1942.) 


6. Determine the solution $z$, valid in the range $0<x<\pi$, $t>0$, of the equation

$$\frac{\partial z}{\partial t} = \frac{\partial^2z}{\partial x^2},$$

such that $z = 0$ for $x = 0$ and $x = \pi$, and $z = x$ for $t = 0$.


(I.C. 1941.) 



Chapter 21 

BESSEL FUNCTIONS 


‘Mine is a long and a sad tale!’ said the Mouse, turning to Alice, and sighing.

‘It is a long tail, certainly’, said Alice, looking down with wonder at the Mouse’s tail; ‘but why do you call it sad?’

Lewis Carroll, Alice's Adventures in Wonderland


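21·01. Definitions of $J_{\pm n}(x)$, $I_{\pm n}(x)$. We have already had (16·10, 18·03) the functions

$$J_n(x) = \sum_{r=0}^{\infty}(-1)^r\frac{(\tfrac12x)^{n+2r}}{r!\,(n+r)!}, \qquad J_{-n}(x) = \sum_{r=0}^{\infty}(-1)^r\frac{(\tfrac12x)^{-n+2r}}{r!\,(-n+r)!}. \tag{1}$$

The corresponding series with all the signs taken positive are

$$I_n(x) = \sum_{r=0}^{\infty}\frac{(\tfrac12x)^{n+2r}}{r!\,(n+r)!}, \qquad I_{-n}(x) = \sum_{r=0}^{\infty}\frac{(\tfrac12x)^{-n+2r}}{r!\,(-n+r)!}. \tag{2}$$

Clearly

$$I_n(x) = e^{-\frac12n\pi i}J_n(xe^{\frac12\pi i}) = e^{\frac12n\pi i}J_n(xe^{-\frac12\pi i}), \tag{3}$$

$$I_{-n}(x) = e^{\frac12n\pi i}J_{-n}(xe^{\frac12\pi i}) = e^{-\frac12n\pi i}J_{-n}(xe^{-\frac12\pi i}). \tag{4}$$

The differential equation satisfied by $J_n$ and $J_{-n}$ is

$$x\frac{d}{dx}\left(x\frac{dy}{dx}\right) + (x^2-n^2)\,y = 0, \tag{5}$$

and that satisfied by $I_n$ and $I_{-n}$ is

$$x\frac{d}{dx}\left(x\frac{dy}{dx}\right) - (x^2+n^2)\,y = 0. \tag{6}$$

$I_n$ and $I_{-n}$ are often called Bessel functions of imaginary argument.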
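The relation between the two families can be confirmed by summing the series directly. The sketch below assumes the power series $\sum(\pm1)^r(\tfrac12x)^{n+2r}/\{r!\,(n+r)!\}$ for $J_n$ and $I_n$, with $(n+r)!$ computed as a gamma function so that $n$ need not be an integer; the helper name and the number of terms are illustrative.

```python
import cmath
import math

def bessel_series(z, n, sign, terms=60):
    """Sum of sign^r (z/2)^(n+2r) / (r! (n+r)!); sign=-1 gives J_n(z), sign=+1 gives I_n(z).

    z may be complex; the principal value of the power is used.
    """
    half = z / 2
    total = 0j
    for r in range(terms):
        total += sign ** r * half ** (n + 2 * r) / (math.gamma(r + 1) * math.gamma(n + r + 1))
    return total

if __name__ == "__main__":
    x, n = 1.7, 0.3
    direct = bessel_series(x, n, +1)
    rotated = cmath.exp(-0.5j * n * math.pi) * bessel_series(1j * x, n, -1)
    print(direct, rotated)
```

The two printed values agree to rounding error, in accordance with $I_n(x) = e^{-\frac12n\pi i}J_n(xe^{\frac12\pi i})$. The helper fails when $n+r+1$ is zero or a negative integer, where the factorial is infinite and the term should be dropped.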


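21·011. Complex integrals: operational forms. Convenient complex integrals for these functions can be found by starting with the operator in terms of $t$ real and positive,

$$p^{-n}\exp(ap^{-1}) = p^{-n}\sum_{r=0}^{\infty}\frac{a^r}{r!\,p^r}, \tag{7}$$

$$p^{-n}\exp(ap^{-1})H(t) = \sum_{r=0}^{\infty}\frac{a^rt^{n+r}}{r!\,(n+r)!}H(t) = \left(\frac ta\right)^{\frac12n}I_n\{2\sqrt{(at)}\}H(t). \tag{8}$$

Hence for $n > -1$, $t > 0$,

$$\left(\frac ta\right)^{\frac12n}I_n\{2\sqrt{(at)}\} = \frac1{2\pi i}\int_L\exp\left(zt + \frac az\right)\frac{dz}{z^{n+1}}. \tag{9}$$

If we now modify the path to $M$, which has termini at $\Re(z) = -\infty$ and crosses the positive real axis, the integral is significant for all $n$ and can be used also to express $I_{-n}\{2\sqrt{(at)}\}$. Again, we may put

$$zt = u. \tag{10}$$

Then (9) becomes

$$\frac{t^n}{2\pi i}\int_M\exp\left(u + \frac{at}{u}\right)\frac{du}{u^{n+1}}, \tag{11}$$

and

$$I_n\{2\sqrt{(at)}\} = \frac1{2\pi i}(at)^{\frac12n}\int_M\exp\left(u + \frac{at}{u}\right)\frac{du}{u^{n+1}}, \tag{12}$$

which is valid without restriction on $t$ and $n$.

If we now put $t = \tfrac14x^2$, $a = 1$, we have

$$I_n(x) = \frac1{2\pi i}(\tfrac12x)^n\int_M\exp\left(u + \frac{x^2}{4u}\right)\frac{du}{u^{n+1}}, \tag{13}$$

and with $a = -1$,

$$J_n(x) = \frac1{2\pi i}(\tfrac12x)^n\int_M\exp\left(u - \frac{x^2}{4u}\right)\frac{du}{u^{n+1}}, \tag{14}$$

as is obvious from comparison of the series (1) and (2).

Another interesting form is got by putting

$$u = \tfrac12x\lambda; \tag{15}$$

then

$$I_n(x) = \frac1{2\pi i}\int_M\exp\left\{\tfrac12x\left(\lambda + \frac1\lambda\right)\right\}\frac{d\lambda}{\lambda^{n+1}}, \tag{16}$$

$$J_n(x) = \frac1{2\pi i}\int_M\exp\left\{\tfrac12x\left(\lambda - \frac1\lambda\right)\right\}\frac{d\lambda}{\lambda^{n+1}}, \tag{17}$$

valid if the termini of $M$ are where $\Re(x\lambda) = -\infty$. These are Schläfli’s integrals. Now in (16) put

$$\lambda + \frac1\lambda = 2\mu; \tag{18}$$

then

$$\lambda = \mu + (\mu^2-1)^{1/2}, \tag{19}$$

the positive sign being taken for $\mu$ real and $> 1$. Then

$$I_n(x) = \frac1{2\pi i}\int_M\frac{e^{\mu x}\,d\mu}{(\mu^2-1)^{1/2}\{\mu + (\mu^2-1)^{1/2}\}^n}, \tag{20}$$

and similarly

$$J_n(x) = \frac1{2\pi i}\int_M\frac{e^{\mu x}\,d\mu}{(\mu^2+1)^{1/2}\{\mu + (\mu^2+1)^{1/2}\}^n}. \tag{21}$$

Thus we have the operational forms for $I_n(x)$ and $J_n(x)$ for $n > -1$,

$$I_n(x)H(x) = \frac{p}{(p^2-1)^{1/2}\{p + (p^2-1)^{1/2}\}^n}H(x), \tag{22}$$

$$J_n(x)H(x) = \frac{p}{(p^2+1)^{1/2}\{p + (p^2+1)^{1/2}\}^n}H(x). \tag{23}$$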

21·02. The Hankel functions $\mathrm{Hs}_n(x)$, $\mathrm{Hi}_n(x)$; $Y_n(x)$. Now integrals of the forms (13), (14) will also satisfy the differential equations if one terminus is taken at $u = 0$, provided that the approach to 0 is in such a direction that $x^2/u \to -\infty$ for $I_n(x)$ and to $+\infty$ for $J_n(x)$. This may be verified directly by the method of 16·10. It is convenient again to take $x$ real, and to use (16), (17). Then $\lambda + 1/\lambda$ is stationary at $\pm1$ and $\lambda - 1/\lambda$ at $\pm i$. These are saddle-points and we may take the paths through them as follows, for $I_n(x)$ and for $J_n(x)$.

[Figures: paths of integration in the $\lambda$ plane through the saddle-points, for $I_n(x)$ and for $J_n(x)$.]

In either case the sum of the integrals along the two paths, in the directions indicated, 
is I n (x) or J n (x), as the case may be, since the paths are together deformable into M. 

In the latter figure we denote the integral along the upper path by $\tfrac12\mathrm{Hs}_n(x)$ and that along the lower by $\tfrac12\mathrm{Hi}_n(x)$.* Then

$$2J_n(x) = \mathrm{Hs}_n(x) + \mathrm{Hi}_n(x). \tag{24}$$

Since the dominant parts of the integrands near the respective saddle-points are $\exp(\pm ix)$, we see that the three functions are related in the same sort of way as $\cos x$, $\exp ix$, and $\exp(-ix)$, and if we also take

$$2iY_n(x) = \mathrm{Hs}_n(x) - \mathrm{Hi}_n(x), \tag{25}$$

$Y_n(x)$ will be analogous to $\sin x$. Also

$$\mathrm{Hs}_n(x) = \frac1{\pi i}\int_0^{(i)\,-\infty}\exp\left\{\tfrac12x\left(\lambda - \frac1\lambda\right)\right\}\frac{d\lambda}{\lambda^{n+1}}, \tag{26}$$

$$\mathrm{Hi}_n(x) = -\frac1{\pi i}\int_0^{(-i)\,-\infty}\exp\left\{\tfrac12x\left(\lambda - \frac1\lambda\right)\right\}\frac{d\lambda}{\lambda^{n+1}}. \tag{27}$$

The way of writing the limits in (26) means that the path goes from 0 to $-\infty$ by way of $i$. These integrals, being analytic functions of $x$, will also be solutions of the differential equation for complex $x$ such that $\Re(x) > 0$.

The paths are transformed into themselves by the substitution

$$\lambda = -1/u. \tag{28}$$

But if we are to maintain the rule that arguments are to be taken as 0 (not $2\pi$) for quantities on the positive real axis we must take respectively,

$$\text{for (26), } \lambda = e^{i\pi}/u; \qquad \text{for (27), } \lambda = e^{-i\pi}/u \tag{29}$$

and then

$$\mathrm{Hs}_n(x) = \frac1{\pi i}e^{-n\pi i}\int_0^{(i)\,-\infty}\exp\left\{\tfrac12x\left(u - \frac1u\right)\right\}u^{n-1}\,du = e^{-n\pi i}\,\mathrm{Hs}_{-n}(x), \tag{30}$$

$$\mathrm{Hi}_n(x) = e^{n\pi i}\,\mathrm{Hi}_{-n}(x). \tag{31}$$

* These functions were introduced by Nielsen and denoted by $H_1^n(x)$, $H_2^n(x)$. Watson denotes them by $H_n^{(1)}(x)$ and $H_n^{(2)}(x)$. The former notation has the disadvantage that the same $n$ is a suffix in $J_n(x)$; the latter is awkward in printing and writing, and almost impossible on a typewriter. We use $\mathrm{Hs}_n(x)$ and $\mathrm{Hi}_n(x)$, the ‘s’ and ‘i’ meaning ‘superior’ and ‘inferior’ in accordance with the paths taken for the Schläfli integrals. Other notations are in use.

But we have also

$$2J_{-n}(x) = \mathrm{Hs}_{-n}(x) + \mathrm{Hi}_{-n}(x) = e^{n\pi i}\,\mathrm{Hs}_n(x) + e^{-n\pi i}\,\mathrm{Hi}_n(x), \tag{32}$$

and therefore

$$i\sin n\pi\,\mathrm{Hs}_n(x) = J_{-n}(x) - e^{-n\pi i}J_n(x), \qquad i\sin n\pi\,\mathrm{Hi}_n(x) = e^{n\pi i}J_n(x) - J_{-n}(x). \tag{33}$$

All functions in the last two relations being analytic in both $x$ and $n$, they can be taken as definitions of $\mathrm{Hs}_n(x)$ and $\mathrm{Hi}_n(x)$ except when $\sin n\pi = 0$. In all cases they will be equal to integrals derivable from (26) and (27) by continuous modification of the path in such a way that $\Re(x\lambda) \to -\infty$ at one terminus. Also from (25)

$$Y_n(x) = \frac{\cos n\pi\,J_n(x) - J_{-n}(x)}{\sin n\pi}. \tag{34}$$

When $n$ is a positive integer $J_{-n}(x)$ reduces to $(-1)^nJ_n(x)$; but when $n$ approaches a positive integer this expression for $Y_n$ tends to a definite limit, which we can then take as the definition of $Y_n(x)$. We take then*

$$Y_n(x) = \lim_{\epsilon\to0}\frac{\cos(n+\epsilon)\pi\,J_{n+\epsilon}(x) - J_{-n-\epsilon}(x)}{\sin(n+\epsilon)\pi} = \frac1\pi\left\{\frac{\partial}{\partial n}J_n(x) - (-1)^n\frac{\partial}{\partial n}J_{-n}(x)\right\}. \tag{35}$$

The terms in (1) fall into two classes according as there is an infinite factorial in the denominator or not. The general term of $J_n(x)$ is

$$u_{n,r}(x) = (-1)^r\frac{(\tfrac12x)^{n+2r}}{r!\,(n+r)!}, \tag{36}$$

and

$$\frac{\partial}{\partial n}u_{n,r}(x) = (-1)^r\frac{(\tfrac12x)^{n+2r}}{r!\,(n+r)!}\{\log\tfrac12x - F(n+r)\}, \tag{37}$$

where $F$ is the digamma function. A similar result holds for the terms of $J_{-n}(x)$ with $r \ge n$. But for $J_{-n}$ with $r < n$ we have

$$(-1)^r\frac{(\tfrac12x)^{-n-\epsilon+2r}}{r!\,(-n-\epsilon+r)!} = (-1)^r\frac{(\tfrac12x)^{-n-\epsilon+2r}}{r!\,(-\epsilon)!}(-\epsilon)(-\epsilon-1)\dots(-\epsilon-n+r+1), \tag{38}$$

$$\frac{\partial}{\partial n}u_{-n,r}(x) = (-1)^n\frac{(n-r-1)!}{r!}(\tfrac12x)^{-n+2r}. \tag{39}$$

* An astonishing variety of notations exists in different accounts of Bessel functions. Watson’s and our $Y_n$ is called $N_n$ by Jahnke and Emde, $-G_n$ by Heaviside, and $\pm2K_n/\pi$ by various other writers. $G_n(x)$ of Gray, Mathews and MacRobert is $\tfrac12i\pi\,\mathrm{Hs}_n(x)$. $G_n$ has also been used in other senses. For details of notation for this and other Bessel functions see Mathematical Tables and other Aids to Computation, 1, 1944, 207-308.

Hence

$$Y_n(x) = \frac1\pi\sum_{r=0}^{\infty}(-1)^r\frac{(\tfrac12x)^{n+2r}}{r!\,(n+r)!}\{\log\tfrac12x - F(n+r)\} + \frac1\pi(-1)^n\sum_{r=n}^{\infty}(-1)^r\frac{(\tfrac12x)^{-n+2r}}{r!\,(-n+r)!}\{\log\tfrac12x - F(-n+r)\} - \frac1\pi\sum_{r=0}^{n-1}\frac{(n-r-1)!}{r!}(\tfrac12x)^{-n+2r}. \tag{40}$$

Put $r = n+s$ in the second series; it becomes

$$\frac1\pi\sum_{s=0}^{\infty}(-1)^s\frac{(\tfrac12x)^{n+2s}}{s!\,(n+s)!}\{\log\tfrac12x - F(s)\}, \tag{41}$$

and

$$Y_n(x) = \frac2\pi J_n(x)\log\tfrac12x - \frac1\pi\sum_{r=0}^{\infty}(-1)^r\frac{(\tfrac12x)^{n+2r}}{r!\,(n+r)!}\{F(r) + F(n+r)\} - \frac1\pi\sum_{r=0}^{n-1}\frac{(n-r-1)!}{r!}(\tfrac12x)^{-n+2r} \tag{42}$$

for $n$ a positive integer or 0. There is always a singularity of $Y_n$ at the origin, and it therefore cannot arise in any solution that holds at the centre of a circle. It can, however, arise in solutions that hold between two concentric circles.
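The series for $Y_n(x)$ with integral $n$ can be tested against a standard identity. The sketch below assumes the form $Y_n = (2/\pi)J_n\log\tfrac12x - (1/\pi)\sum(-1)^r(\tfrac12x)^{n+2r}\{F(r)+F(n+r)\}/\{r!\,(n+r)!\} - (1/\pi)\sum_{r<n}(n-r-1)!\,(\tfrac12x)^{-n+2r}/r!$, with $F(r) = -\gamma + 1 + \tfrac12 + \dots + 1/r$ for integral $r$ ($\gamma$ being Euler's constant). The check used, $J_{n+1}Y_n - J_nY_{n+1} = 2/\pi x$, is the usual Wronskian relation and is brought in here as an outside fact rather than something derived in this section.

```python
import math

EULER_GAMMA = 0.5772156649015329

def F(r):
    """Digamma function in the sense d/dz log z!, at a non-negative integer r."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, r + 1))

def J(n, x, terms=40):
    """Power series for J_n(x), n a non-negative integer."""
    return sum((-1) ** r * (x / 2) ** (n + 2 * r) / (math.factorial(r) * math.factorial(n + r))
               for r in range(terms))

def Y(n, x, terms=40):
    """Series for Y_n(x), n a non-negative integer, x > 0."""
    log_part = (2.0 / math.pi) * J(n, x, terms) * math.log(x / 2)
    infinite = sum((-1) ** r * (x / 2) ** (n + 2 * r) * (F(r) + F(n + r))
                   / (math.factorial(r) * math.factorial(n + r)) for r in range(terms))
    finite = sum(math.factorial(n - r - 1) / math.factorial(r) * (x / 2) ** (-n + 2 * r)
                 for r in range(n))
    return log_part - infinite / math.pi - finite / math.pi

if __name__ == "__main__":
    x = 1.5
    print(J(1, x) * Y(0, x) - J(0, x) * Y(1, x), 2.0 / (math.pi * x))
```

The finite sum is what produces the singularity at the origin: for $n \ge 1$ it behaves like $(n-1)!\,(\tfrac12x)^{-n}/\pi$, and for $n = 0$ the logarithm takes over.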

21·021. Integrals on real paths for $J_n(x)$, $Y_n(x)$. Another integral expression for $\mathrm{Hs}_n(x)$ can be got by specifying the path of (26) definitely as follows.

[Figure: path in the $\lambda$ plane from 0 along the real axis to $+1$, round the upper unit semicircle to $-1$, and along the real axis to $-\infty$.]

On the stretch 0 to $+1$ put $\lambda = e^{-u}$; then this part is

$$\frac1{\pi i}\int_0^\infty\exp\{\tfrac12x(e^{-u} - e^{u})\}\,e^{nu}\,du = \frac1{\pi i}\int_0^\infty\exp(-x\sinh u + nu)\,du. \tag{43}$$

On the semicircle $+1$ to $-1$ put $\lambda = \exp(i\theta)$; then we have

$$\frac1\pi\int_0^\pi\exp(ix\sin\theta)\exp(-ni\theta)\,d\theta. \tag{44}$$

From $-1$ to $-\infty$ put $\lambda = \exp(u + i\pi)$; this part is

$$\frac1{\pi i}\int_0^\infty\exp(-x\sinh u)\exp(-nu - n\pi i)\,du, \tag{45}$$

and in all

$$\mathrm{Hs}_n(x) = \frac1\pi\int_0^\pi\exp(ix\sin\theta - ni\theta)\,d\theta + \frac1{\pi i}\int_0^\infty\exp(-x\sinh u)\{\exp nu + \exp(-nu - n\pi i)\}\,du. \tag{46}$$

Since when $x$ is real $\mathrm{Hs}_n$ and $\mathrm{Hi}_n$ are conjugate complexes

$$\mathrm{Hi}_n(x) = \frac1\pi\int_0^\pi\exp(-ix\sin\theta + ni\theta)\,d\theta - \frac1{\pi i}\int_0^\infty\exp(-x\sinh u)\{\exp nu + \exp(-nu + n\pi i)\}\,du \tag{47}$$

and

$$J_n(x) = \frac1\pi\int_0^\pi\cos(x\sin\theta - n\theta)\,d\theta - \frac1\pi\int_0^\infty\exp(-x\sinh u - nu)\sin n\pi\,du, \tag{48}$$

$$Y_n(x) = \frac1\pi\int_0^\pi\sin(x\sin\theta - n\theta)\,d\theta - \frac1\pi\int_0^\infty\exp(-x\sinh u)\{\exp nu + \exp(-nu)\cos n\pi\}\,du \tag{49}$$

valid for $\Re(x) > 0$; the first is also valid for $\Re(x) = 0$ if $\Re(n) > 0$. If $n$ is a positive integer $J_n(x)$ reduces to the first integral, which is known as Bessel’s integral. It occurs in the expression of the radius vector in planetary motion in terms of the eccentric angle.
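For integral $n$ Bessel's integral is convenient for computation, because the trapezoidal rule converges very rapidly on an integrand of this kind. The following is a minimal sketch, assuming $J_n(x) = (1/\pi)\int_0^\pi\cos(x\sin\theta-n\theta)\,d\theta$ for integral $n$ and comparing it with the power series; the function names and step count are illustrative.

```python
import math

def bessel_integral(n, x, steps=200):
    """Trapezoidal estimate of (1/pi) * integral_0^pi cos(x sin(theta) - n theta) d(theta)."""
    def integrand(theta):
        return math.cos(x * math.sin(theta) - n * theta)
    width = math.pi / steps
    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for k in range(1, steps):
        total += integrand(k * width)
    return total * width / math.pi

def bessel_series(n, x, terms=40):
    """Power series for J_n(x), n a non-negative integer."""
    return sum((-1) ** r * (x / 2) ** (n + 2 * r) / (math.factorial(r) * math.factorial(n + r))
               for r in range(terms))

if __name__ == "__main__":
    for n in (0, 1, 2):
        print(n, bessel_integral(n, 3.0), bessel_series(n, 3.0))
```

For non-integral $n$ the first integral alone is not $J_n(x)$; the second integral in (48), which carries the factor $\sin n\pi$, must then be included.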

21·022. Integrals on real paths for $I_n(x)$; the second solution $\mathrm{Kh}_n(x)$. We can get similar expressions for the integrals along the paths for $I_n(x)$ by taking $x$ to be complex with a positive real part and choosing the appropriate path; or we can proceed directly as follows. Using (16) we take the path in the $\lambda$ plane

[Figure: path in the $\lambda$ plane from $-\infty$ to $-1$ below the real axis, round the unit circle, and from $-1$ to $-\infty$ above the real axis.]

On the circle put

$$\lambda = e^{i\theta}. \tag{50}$$

Then its contribution to $I_n(x)$ is

$$\frac1{2\pi}\int_{-\pi}^{\pi}\exp(x\cos\theta - ni\theta)\,d\theta = \frac1\pi\int_0^\pi\exp(x\cos\theta)\cos n\theta\,d\theta. \tag{51}$$

From $-\infty$ to $-1$ and $-1$ to $-\infty$ put

$$\lambda = \exp(u - i\pi), \qquad \lambda = \exp(u + i\pi). \tag{52}$$

The contributions are

$$-\frac1{2\pi i}\int_0^\infty\exp(-x\cosh u - nu)\,e^{ni\pi}\,du + \frac1{2\pi i}\int_0^\infty\exp(-x\cosh u - nu)\,e^{-ni\pi}\,du = -\frac{\sin n\pi}{\pi}\int_0^\infty\exp(-x\cosh u - nu)\,du. \tag{53}$$

Hence

$$I_n(x) = \frac1\pi\int_0^\pi\exp(x\cos\theta)\cos n\theta\,d\theta - \frac{\sin n\pi}{\pi}\int_0^\infty\exp(-x\cosh u - nu)\,du. \tag{54}$$

Also

$$I_{-n}(x) = \frac1\pi\int_0^\pi\exp(x\cos\theta)\cos n\theta\,d\theta + \frac{\sin n\pi}{\pi}\int_0^\infty\exp(-x\cosh u + nu)\,du \tag{55}$$

and

$$I_{-n}(x) - I_n(x) = \frac{2\sin n\pi}{\pi}\int_0^\infty\exp(-x\cosh u)\cosh nu\,du \tag{56}$$

$$= \sin n\pi\,\mathrm{Kh}_n(x) \tag{57}$$

say.* Then $\mathrm{Kh}_n$ is significant even when $n$ is an integer, subject to $\Re(x) > 0$.

Again, take the integral from $-\infty$ to 0 by itself, with $\arg\lambda = -\pi$. Then the part from $-\infty$ to $-1$ gives

$$-\frac1{2\pi i}\,e^{ni\pi}\int_0^\infty\exp(-x\cosh u - nu)\,du$$

as before. From $-1$ to 0 put

$$\lambda = \exp(-u - i\pi).$$

This gives

$$-\frac1{2\pi i}\,e^{ni\pi}\int_0^\infty\exp(-x\cosh u + nu)\,du$$

and the whole integral is

$$-\frac1{\pi i}\,e^{ni\pi}\int_0^\infty\exp(-x\cosh u)\cosh nu\,du = -\frac{e^{ni\pi}}{2i}\,\mathrm{Kh}_n(x). \tag{58}$$

$\mathrm{Kh}_n(x)$ can be transformed back to the Schläfli form; we have

$$\mathrm{Kh}_n(x) = \frac2\pi\int_0^\infty\exp(-x\cosh v)\cosh nv\,dv = \frac1\pi\int_{-\infty}^{\infty}\exp(-x\cosh v + nv)\,dv \quad (\Re(x) > 0). \tag{59}$$

Put $e^v = \lambda$; then

$$\mathrm{Kh}_n(x) = \frac1\pi\int_0^\infty\exp\left\{-\tfrac12x\left(\lambda + \frac1\lambda\right)\right\}\lambda^{n-1}\,d\lambda \quad (\Re(x) > 0); \tag{60}$$

and if $\tfrac12x\lambda = u$,

$$\mathrm{Kh}_n(x) = \frac1\pi(\tfrac12x)^{-n}\int_0^\infty\exp\left(-u - \frac{x^2}{4u}\right)u^{n-1}\,du, \quad |\arg x| < \tfrac14\pi. \tag{61}$$

Clearly $\mathrm{Kh}_n(x) = \mathrm{Kh}_{-n}(x)$, so that we can always take $n \ge 0$.

By continuity the same forms will hold if $\Re(x) = 0$ or $\Re(x^2) = 0$ provided that the integrals converge. In particular (61) is true if $\Re(x^2) = 0$ and $n > 0$.

* This function is that used by Heaviside; Watson, following Macdonald, takes $K_n(x)$ equal to $\tfrac12\pi$ times this, though he recognizes explicitly that the factor complicates the relation to $\mathrm{Hs}_n(x)$ and $\mathrm{Hi}_n(x)$. It also complicates the relation to the Legendre function. Published tables refer to Macdonald’s function; but the occasions for using the relations between $K_n$, $\mathrm{Hs}_n$, and $\mathrm{Hi}_n$ are so numerous that the most convenient procedure would be to divide the published tables by $\tfrac12\pi$. We write Heaviside’s function as $\mathrm{Kh}_n(x)$ to distinguish it from Macdonald’s.
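The real integral for $\mathrm{Kh}_n(x)$ is well suited to numerical work, since the integrand decays like a double exponential. The sketch below assumes $\mathrm{Kh}_n(x) = (2/\pi)\int_0^\infty\exp(-x\cosh u)\cosh nu\,du$ and compares it with the closed form $\sqrt{(2/\pi x)}\,e^{-x}$ for $n=\tfrac12$, which follows from the standard result $K_{1/2}(x)=\sqrt{(\pi/2x)}\,e^{-x}$ for Macdonald's function (an outside fact, not derived here). The cut-off and step count are illustrative choices that suit moderate real $x$.

```python
import math

def kh(n, x, upper=12.0, steps=4000):
    """Composite Simpson estimate of (2/pi) * integral_0^upper exp(-x cosh u) cosh(n u) du."""
    def integrand(u):
        return math.exp(-x * math.cosh(u)) * math.cosh(n * u)
    width = upper / steps
    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * width)
    return (2.0 / math.pi) * total * width / 3.0

if __name__ == "__main__":
    x = 1.3
    print(kh(0.5, x), math.sqrt(2.0 / (math.pi * x)) * math.exp(-x))
    print(kh(0.5, x) - kh(-0.5, x))
```

Because only $\cosh nu$ enters, the symmetry $\mathrm{Kh}_n = \mathrm{Kh}_{-n}$ holds exactly in the computation as well.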

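$\mathrm{Kh}_n$ can be expressed directly in terms of $\mathrm{Hs}_n$ or $\mathrm{Hi}_n$. For, using our original relations between the $I$ and $J$ functions, we have

$$\sin n\pi\,\mathrm{Kh}_n(x) = I_{-n}(x) - I_n(x) \tag{62}$$

$$= e^{\frac12n\pi i}J_{-n}(xe^{\frac12\pi i}) - e^{-\frac12n\pi i}J_n(xe^{\frac12\pi i}) \tag{63}$$

$$= i\sin n\pi\,e^{\frac12n\pi i}\,\mathrm{Hs}_n(xe^{\frac12\pi i}), \tag{64}$$

$$\sin n\pi\,\mathrm{Kh}_n(x) = e^{-\frac12n\pi i}J_{-n}(xe^{-\frac12\pi i}) - e^{\frac12n\pi i}J_n(xe^{-\frac12\pi i}) \tag{65}$$

$$= -i\sin n\pi\,e^{-\frac12n\pi i}\,\mathrm{Hi}_n(xe^{-\frac12\pi i}). \tag{66}$$

Hence

$$\mathrm{Kh}_n(x) = ie^{\frac12n\pi i}\,\mathrm{Hs}_n(xe^{\frac12\pi i}) = -ie^{-\frac12n\pi i}\,\mathrm{Hi}_n(xe^{-\frac12\pi i}). \tag{67}$$

In consequence of these relations all the Bessel functions for given $n$ can be expressed in terms of the single function $\mathrm{Kh}_n(x)$. When $n$ is a positive integer or 0 we can represent $\mathrm{Kh}_n(x)$ by

$$\mathrm{Kh}_n(x) = \lim_{\epsilon\to0}\frac{I_{-n-\epsilon}(x) - I_{n+\epsilon}(x)}{\sin(n+\epsilon)\pi} \tag{68}$$

$$= (-1)^{n+1}\frac2\pi I_n(x)\log(\tfrac12x) + \frac{(-1)^n}{\pi}\sum_{r=0}^{\infty}\frac{(\tfrac12x)^{n+2r}}{r!\,(n+r)!}\{F(r) + F(n+r)\} + \frac1\pi\sum_{r=0}^{n-1}(-1)^r\frac{(n-r-1)!}{r!}(\tfrac12x)^{-n+2r}, \tag{69}$$

and in particular

$$\mathrm{Kh}_0(x) = -\frac2\pi I_0(x)\log(\tfrac12x) + \frac2\pi\sum_{r=0}^{\infty}\frac{(\tfrac12x)^{2r}}{(r!)^2}F(r). \tag{70}$$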

21·03. Further complex integrals for $J_n(x)$. Other integral representations of the Bessel functions can be found by putting in the equation for $J_n(x)$

$$y = x^{-n}u. \tag{71}$$

Then

$$xu'' - (2n-1)u' + xu = 0. \tag{72}$$

Substitute

$$u = \int e^{zx}Z\,dz \tag{73}$$

along a path to be determined; then

$$xu'' - (2n-1)u' + xu = \int\{xz^2 - (2n-1)z + x\}Ze^{zx}\,dz = \left[(z^2+1)Ze^{zx}\right] - \int e^{zx}\{(z^2+1)Z' + (2n+1)zZ\}\,dz, \tag{74}$$

and if the integrand is to vanish

$$\frac{Z'}{Z} = -\frac{(2n+1)z}{z^2+1}, \tag{75}$$

$$Z \propto (z^2+1)^{-(n+\frac12)}. \tag{76}$$

Then a solution will be

$$u = \frac1{2\pi i}\int_M\frac{e^{zx}}{(z^2+1)^{n+\frac12}}\,dz, \tag{77}$$

where $\Re(zx) \to -\infty$ at the ends of the path. If $n > -\tfrac12$ the integral will converge on the standard Bromwich path, and we have the operational form $p/(p^2+1)^{n+\frac12}$. The first term in the expansion is $p^{-2n} = x^{2n}/(2n)!$. This identifies the solution as a constant multiple of $x^nJ_n(x)$, and indeed it is $\dfrac{2^nn!}{(2n)!}x^nJ_n(x)$, as we can verify by direct expansion of the denominator in descending powers of $z$. Then using the multiplication formula for $(2n)!$

$$\frac{p}{(p^2+1)^{n+\frac12}}H(x) = \frac{\sqrt\pi}{2^n(n-\frac12)!}\,x^nJ_n(x). \tag{78}$$

This result was found in this way by van der Pol. Apparently, though the equation (72) is of the second order and must have another solution $x^nJ_{-n}(x)$, we have found only one solution. But the integrand in (77) has branch points at $\pm i$. If we take a figure of eight contour surrounding them, or a loop from $-\infty$ about either, we shall obtain other solutions. Analogous integrals are used by Watson in Chapter 6 of his book.

If $n + \tfrac12$ is an integer the integrand is single-valued and the solutions will be expressible in finite terms, as we have already seen in considering the asymptotic expansions for this case. The present form has one advantage over that used by Watson: he gets a factor in the integrand that would be $(z^2+1)^{n-\frac12}$ in the present notation, and if $n - \tfrac12$ was a positive integer or zero the integrals would vanish. This complication is avoided by having the factor $(z^2+1)^{-(n+\frac12)}$. On the other hand his form is more manageable when the path is reduced to a loop about a branch point.


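21·04. Recurrence formulae. Returning to

$$J_n(x) = \frac1{2\pi i}(\tfrac12x)^n\int_M\exp\left(u - \frac{x^2}{4u}\right)\frac{du}{u^{n+1}} \tag{1}$$

and differentiating, we have

$$J_n'(x) = \frac nxJ_n(x) - \frac1{2\pi i}(\tfrac12x)^n\int_M\exp\left(u - \frac{x^2}{4u}\right)\frac{\tfrac12x\,du}{u^{n+2}} = \frac nxJ_n(x) - J_{n+1}(x). \tag{2}$$

Differentiation of

$$J_n(x) = \frac1{2\pi i}\int_M\exp\left\{\tfrac12x\left(\lambda - \frac1\lambda\right)\right\}\frac{d\lambda}{\lambda^{n+1}} \tag{3}$$

gives

$$J_n'(x) = \frac1{2\pi i}\int_M\exp\left\{\tfrac12x\left(\lambda - \frac1\lambda\right)\right\}\tfrac12\left(\frac1{\lambda^n} - \frac1{\lambda^{n+2}}\right)d\lambda = \tfrac12J_{n-1}(x) - \tfrac12J_{n+1}(x). \tag{4}$$

By subtraction

$$J_{n-1}(x) + J_{n+1}(x) = \frac{2n}xJ_n(x). \tag{5}$$

Also from (4) and (5)

$$J_n'(x) = J_{n-1}(x) - \frac nxJ_n(x). \tag{6}$$

These differentiations have been carried out on the integrals over the path $M$, which give $J_n(x)$. But they could equally be done on the paths used for $\mathrm{Hs}_n(x)$ and $\mathrm{Hi}_n(x)$, which therefore satisfy the same recurrence relations; and then by the definition of $Y_n(x)$ this also will satisfy them.

The corresponding relations for $I_n(x)$ and $\mathrm{Kh}_n(x)$ are somewhat different, on account of the difference of sign of $1/u$ and $1/\lambda$ in the exponent. We find

$$I_n'(x) = \frac nxI_n(x) + I_{n+1}(x) = -\frac nxI_n(x) + I_{n-1}(x) = \tfrac12I_{n-1}(x) + \tfrac12I_{n+1}(x), \tag{7}$$

$$I_{n-1}(x) - I_{n+1}(x) = \frac{2n}xI_n(x). \tag{8}$$

$$\mathrm{Kh}_n'(x) = \frac nx\mathrm{Kh}_n(x) - \mathrm{Kh}_{n+1}(x) = -\frac nx\mathrm{Kh}_n(x) - \mathrm{Kh}_{n-1}(x) = -\tfrac12\mathrm{Kh}_{n-1}(x) - \tfrac12\mathrm{Kh}_{n+1}(x). \tag{9}$$

$$\mathrm{Kh}_{n-1}(x) - \mathrm{Kh}_{n+1}(x) = -\frac{2n}x\mathrm{Kh}_n(x). \tag{10}$$

In consequence of these relations it is possible, given any Bessel function for $n = 0$ and $n = 1$, to build up the same function for any integral $n$.

Particularly important cases are

$$J_0'(x) = -J_1(x), \qquad \frac d{dx}\{xJ_1(x)\} = xJ_0(x); \tag{11}$$

$$I_0'(x) = I_1(x), \qquad \frac d{dx}\{xI_1(x)\} = xI_0(x); \tag{12}$$

$$\mathrm{Kh}_0'(x) = -\mathrm{Kh}_1(x), \qquad \frac d{dx}\{x\mathrm{Kh}_1(x)\} = -x\mathrm{Kh}_0(x). \tag{13}$$

These occur in hydrodynamical problems relating to cylinders, where if the radial velocity depends on $J_0$ the azimuthal velocity depends on $J_1$, and conversely.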
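The recurrence relations are easily verified from the power series. The sketch below assumes the series definitions of $J_n$ and $I_n$ (with factorials taken as gamma functions, so that $n$ may be fractional) and checks $J_{n-1}+J_{n+1}=(2n/x)J_n$, $I_{n-1}-I_{n+1}=(2n/x)I_n$ and $J_n'=\tfrac12J_{n-1}-\tfrac12J_{n+1}$; the names and the finite-difference step are illustrative.

```python
import math

def series(order, x, sign, terms=50):
    """sign=-1: J_order(x); sign=+1: I_order(x). Requires x > 0 and order > -1."""
    return sum(sign ** r * (x / 2) ** (order + 2 * r)
               / (math.gamma(r + 1) * math.gamma(order + r + 1)) for r in range(terms))

def J(order, x):
    return series(order, x, -1)

def I(order, x):
    return series(order, x, +1)

def derivative(f, x, step=1e-5):
    """Central difference estimate of f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)

if __name__ == "__main__":
    n, x = 1.5, 2.0
    print(J(n - 1, x) + J(n + 1, x) - 2 * n / x * J(n, x))
    print(I(n - 1, x) - I(n + 1, x) - 2 * n / x * I(n, x))
    print(derivative(lambda s: J(n, s), x) - 0.5 * (J(n - 1, x) - J(n + 1, x)))
```

All three residuals should be at the level of rounding or differencing error. Starting from $n = 0$ and $n = 1$, the first relation can be applied repeatedly to build up $J_n$ for higher integral $n$, though forward recurrence loses accuracy once $n$ exceeds $x$.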

21·05. Asymptotic formulae of Stokes’s type. These are most easily obtained for $\mathrm{Hs}_n(x)$, $\mathrm{Hi}_n(x)$, and $\mathrm{Kh}_n(x)$, on account of the fact that the natural paths of integration to use for them pass through only one saddle-point. The forms 21·02 (26) (27) are perhaps the most convenient for getting the first term, since with them the positions of the saddle-points of the exponential factor are independent of $x$. Writing

$$f(\lambda) = \tfrac12x\left(\lambda - \frac1\lambda\right), \qquad f''(i) = -x/i^3 = -ix, \tag{1}$$

the integrand for $\mathrm{Hs}_n(x)$ at $\lambda = i$ is

$$\frac1{\pi i}\exp(ix)\exp\{-\tfrac12(n+1)\pi i\} \tag{2}$$

and the path of steepest descent, for $x$ real and positive, is at $\tfrac34\pi$ to the positive real axis. Then

$$\mathrm{Hs}_n(x) \sim \frac1{\pi i}\exp i\{x - \tfrac12(n+1)\pi\}\left(\frac{2\pi}x\right)^{1/2}\exp(\tfrac34\pi i) = \left(\frac2{\pi x}\right)^{1/2}\exp\{i(x - \tfrac12n\pi - \tfrac14\pi)\}, \tag{3}$$

and similarly

$$\mathrm{Hi}_n(x) \sim \left(\frac2{\pi x}\right)^{1/2}\exp\{-i(x - \tfrac12n\pi - \tfrac14\pi)\}. \tag{4}$$

Hence

$$J_n(x) \sim \left(\frac2{\pi x}\right)^{1/2}\cos(x - \tfrac12n\pi - \tfrac14\pi), \tag{5}$$

$$Y_n(x) \sim \left(\frac2{\pi x}\right)^{1/2}\sin(x - \tfrac12n\pi - \tfrac14\pi). \tag{6}$$

These determine the coefficients of the various first terms. The rest of the expansions can be determined from some of the integrals, but are most easily found from the differential equation as in 17·12.1. We have

$$\mathrm{Hs}_n(x) \sim \left(\frac2{\pi x}\right)^{1/2}\exp\{i(x - \tfrac12n\pi - \tfrac14\pi)\}(U - iV), \tag{7}$$

$$\mathrm{Hi}_n(x) \sim \left(\frac2{\pi x}\right)^{1/2}\exp\{-i(x - \tfrac12n\pi - \tfrac14\pi)\}(U + iV), \tag{8}$$

$$J_n(x) \sim \left(\frac2{\pi x}\right)^{1/2}\{U\cos(x - \tfrac12n\pi - \tfrac14\pi) + V\sin(x - \tfrac12n\pi - \tfrac14\pi)\}, \tag{9}$$

$$Y_n(x) \sim \left(\frac2{\pi x}\right)^{1/2}\{U\sin(x - \tfrac12n\pi - \tfrac14\pi) - V\cos(x - \tfrac12n\pi - \tfrac14\pi)\}, \tag{10}$$

where

$$U = 1 - \frac{(1-4n^2)(9-4n^2)}{2!\,(8x)^2} + \frac{(1-4n^2)(9-4n^2)(25-4n^2)(49-4n^2)}{4!\,(8x)^4} - \dots, \tag{11}$$

$$V = \frac{1-4n^2}{8x} - \frac{(1-4n^2)(9-4n^2)(25-4n^2)}{3!\,(8x)^3} + \dots. \tag{12}$$

These are Stokes’s expansions. We have taken $x$ as real and positive, but we see that this condition can be relaxed. For with a complex $x$ we still obtain the correct integrand at $\lambda = i$ by direct substitution. The direction of the path is rotated by $-\tfrac12\arg x$, and this is allowed for by the factor $x^{-1/2}$. Yet there is a limit to the range permitted to $\arg x$. For, let us suppose it increased continuously by $2\pi$. $J_n(x)$ is multiplied by $\exp(2n\pi i)$ and returns to its original value if $n$ is a positive integer. But each term of (9) is multiplied by $\exp(-i\pi)$. We know already that an asymptotic expansion cannot be correct for all values of $\arg x$ unless it converges; we see here that to permit unlimited variation of $\arg x$ would lead to seriously wrong results. To trace the origin of the change we notice that as $\arg x$ increases, if we keep to paths of steepest descent the path near each saddle-point rotates negatively by half as much as $\arg x$ increases, and further $\arg\lambda$ must decrease at $|\lambda| = \infty$ and increase at $|\lambda|$ small, by the same amount as $\arg x$ increases.

Let $\arg x$ increase from 0 to $2\pi$. Then the paths of steepest descent are exactly as at the start, but the directions of travel through the saddle-points are reversed, and therefore they are traversed in opposite directions to their original ones if we try to maintain the continuity of the approximation. But this is clearly wrong. For with $n$ a positive integer we must get back to the same value of $J_n(x)$, and $\mathrm{Hs}_n(x)$ is always an integral from 0 to $\infty\exp(i\pi - i\arg x)$, and $\mathrm{Hi}_n(x)$ one from $\infty\exp(-i\pi - i\arg x)$ to 0, the two together constituting a loop in the positive sense about the origin. The reversal in sign must come from a failure of the asymptotic expressions to represent $\mathrm{Hs}_n(x)$ and $\mathrm{Hi}_n(x)$ over the range $0 < \arg x \le 2\pi$, and the reason is that at some value of $\arg x$ continuous deformation of the steepest descents path near a saddle-point makes it change into one going from the origin to infinity instead of from infinity to the origin. Thus let $\arg x = \tfrac12\pi$. The path for $\mathrm{Hs}_n(x)$ is straight along the imaginary axis, and there is no trouble. But that for $\mathrm{Hi}_n(x)$ is as shown, a dotted line indicating the path for $\arg x$ a little less than $\tfrac12\pi$. If $\arg x$ is a little greater than $\tfrac12\pi$, we could take a path in a suitable direction from infinity to $i$, round the circle, and from $i$ to the origin again in a suitable way, and $\mathrm{Hi}_n(x)$ would be given correctly. But such a path would not be the steepest descents path (the other dotted line), which, if it is going from left to right near $-i$, would cross the circle from inside and go from 0 to infinity. We can see that the difference of the integrals along the two paths is $2\mathrm{Hs}_n(x)$. This is irrelevant to the definition of the asymptotic expansion; for every term in $\mathrm{Hs}_n(x)$ contains an exponentially small factor when $0 < \arg x < \pi$, and the whole series, when $|x|$ is large enough, will be small compared with any given term of the series for $\mathrm{Hi}_n(x)$. The expansions are therefore both valid when $0 < \arg x < \pi$; but at $\arg x = \pi$ the moduli of the first terms become equal, and the new portion of $\mathrm{Hi}_n$ will become the larger. Hence the asymptotic expansion of $\mathrm{Hi}_n(x)$ is not valid when $\arg x > \pi$. That of $\mathrm{Hs}_n(x)$ will similarly begin to represent a function with a multiple of $\mathrm{Hi}_n(x)$ when $\arg x = \tfrac32\pi$, and the new part will become the larger when $\arg x = 2\pi$. A little further consideration shows that the range of validity of the asymptotic expansions is $-\pi < \arg x < 2\pi$ for $\mathrm{Hs}_n(x)$, $-2\pi < \arg x < \pi$ for $\mathrm{Hi}_n(x)$. The expansions of $J_n(x)$ and $Y_n(x)$ based on them are therefore valid for $-\pi < \arg x < \pi$, but of course not at $\arg x = \pm\pi$.

[Figure: steepest-descent paths for $\mathrm{Hi}_n(x)$ in the $\lambda$ plane for $\arg x$ near $\tfrac12\pi$, the dotted lines showing the paths for $\arg x$ a little less and a little greater than $\tfrac12\pi$.]



21 - 051 . Asymptotic formulae for I n (x) and Kh n (#) can be derived from these by change 
of argument, using (3), (4), or directly from the integrals 21-022 (51), (60). The later 
terms are found from the differential equation. 


40*0 


*J(2nx) 


1 — 4n 2 (1 —4w 2 )(9 —4w 2 ) 

1! 8a; 2! (8cc) a 

l 1 —4n 2 ( (1 —4n 2 )(9 —4n 2 ) 


4 


l\8x 


2! (8a:) 2 


(13) 

(14) 


It can be shown that for real argument the error in stopping at any term of the series
U and V has the same sign as the first term neglected, and therefore the function lies
between the sums of r and r + 1 terms of the series if the number of terms retained is so
large that consecutive terms alternate in sign.* The same is true of $\mathrm{Kh}_n(x)$, but the
corresponding inequality for $I_n(x)$ is more complicated.

* Watson, Theory of Bessel Functions, p. 209.

21·052. Interpretations of $\mathrm{Kh}_0(p\varpi/c)$, $p\,\mathrm{Kh}_0(p\varpi/c)$, $\mathrm{Kh}_0(q\varpi)$. The physical applications
of the functions can be distinguished by the exponential factors in the asymptotic
expansions. Thus where we should write for a wave travelling in the positive direction
of the axis of x, in one dimension,

$$\cos\kappa(ct - x) = \Re \exp i\kappa(ct - x),$$

we should write for a symmetrical wave in two dimensions $\Re\, e^{i\kappa ct}\,\mathrm{Hi}_0(\kappa\varpi)$, the phase of
which, when $\kappa\varpi$ is large, will travel outwards with velocity c. The function $D_0(x)$ used in
Lamb's Hydrodynamics is $-i\,\mathrm{Hi}_0(x)$. The function $\mathrm{Hi}_n(x)$ is therefore specially convenient
for treating spreading harmonic waves.

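The finiteness of $I_n(x)$ at the origin makes it the suitable function to be used for
problems dealing with the interior of a circle; but since $I_n(x)$ tends to infinity with x the
proper function of imaginary argument to use outside a circle is $\mathrm{Kh}_n(x)$. $\mathrm{Kh}_0(pa)$, where
p is the Heaviside operator, occurs regularly in problems of the spreading of cylindrical
disturbances and plays a part similar to $\exp(-ph)$ in one dimension. In its simplest form
we have

$$\mathrm{Kh}_0(x) = \frac{2}{\pi}\int_0^\infty \exp(-x\cosh u)\,du. \tag{1}$$

Then

$$\mathrm{Kh}_0\!\left(\frac{p\varpi}{c}\right)H(t) = \frac{2}{\pi}\int_0^\infty \exp\!\left(-\frac{p\varpi}{c}\cosh u\right)du\,H(t) = \frac{2}{\pi}\int_0^\infty H\!\left(t - \frac{\varpi}{c}\cosh u\right)du \tag{2}$$

$$= 0 \quad (t < \varpi/c); \qquad = \frac{2}{\pi}\int_0^{\cosh^{-1}(ct/\varpi)} du \quad (t > \varpi/c),$$

and therefore in general

$$\mathrm{Kh}_0\!\left(\frac{p\varpi}{c}\right)H(t) = \frac{2}{\pi}\cosh^{-1}\!\left(\frac{ct}{\varpi}\right)H\!\left(t - \frac{\varpi}{c}\right). \tag{3}$$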


The operator $\mathrm{Kh}_0(p\varpi/c)$ therefore gives a disturbance beginning at time $\varpi/c$ and spreading
out with velocity c; its magnitude at a given place will ultimately increase indefinitely
with time. Differentiating, however, we get a more usual operator

$$p\,\mathrm{Kh}_0\!\left(\frac{p\varpi}{c}\right)H(t) = \frac{2}{\pi}\,\frac{H(t - \varpi/c)}{(t^2 - \varpi^2/c^2)^{1/2}}. \tag{4}$$

These show at once a characteristic feature of waves in two dimensions; unlike waves in
one or three dimensions, there is no sudden end to a two-dimensional disturbance, but an
indefinitely prolonged trail.
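The interpretation of the operator can be checked by brute force. The sketch below assumes the result reads $\mathrm{Kh}_0(p\varpi/c)H(t) = (2/\pi)\cosh^{-1}(ct/\varpi)$ for $t > \varpi/c$ and zero before; it compares that closed form with a direct midpoint-rule evaluation of $(2/\pi)\int_0^\infty H(t - (\varpi/c)\cosh u)\,du$. All function names are illustrative.

```python
import math

def kh0_operator_closed(t, varpi, c):
    """(2/pi) * arccosh(c t / varpi) for t > varpi/c, and 0 before the arrival."""
    if c * t <= varpi:
        return 0.0
    return (2 / math.pi) * math.acosh(c * t / varpi)

def kh0_operator_integral(t, varpi, c, upper=12.0, steps=120000):
    """Midpoint rule for (2/pi) * int_0^inf H(t - (varpi/c) cosh u) du.

    The truncation at `upper` is adequate while c t / varpi < cosh(upper).
    """
    h = upper / steps
    hits = sum(1 for j in range(steps)
               if t - (varpi / c) * math.cosh((j + 0.5) * h) > 0)
    return (2 / math.pi) * hits * h

def p_kh0_operator(t, varpi, c):
    """Time derivative of the closed form: (2/pi) / sqrt(t^2 - varpi^2/c^2)."""
    if c * t <= varpi:
        return 0.0
    return (2 / math.pi) / math.sqrt(t * t - (varpi / c) ** 2)
```

The slow logarithmic growth of `kh0_operator_closed` with t, and the inverse-square-root tail of `p_kh0_operator`, are the "indefinitely prolonged trail" in numerical form.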

Again, if $h > 0$, $p = h^2q^2$, (5)

$$\mathrm{Kh}_0(q\varpi)H(t) = \frac{2}{\pi}\int_0^\infty \exp(-q\varpi\cosh u)\,du\,H(t) = \frac{1}{\pi}\int_{\varpi^2/4h^2t}^{\infty} e^{-u}\,\frac{du}{u}\,H(t) = \frac{1}{\pi}\,\mathrm{ei}\!\left(\frac{\varpi^2}{4h^2t}\right). \tag{6}$$

Differentiating with respect to t we have for $t > 0$

$$p\,\mathrm{Kh}_0(q\varpi)H(t) = \frac{1}{\pi}\,\frac{e^{-\varpi^2/4h^2t}}{\varpi^2/4h^2t}\,\frac{\varpi^2}{4h^2t^2} = \frac{1}{\pi t}\,e^{-\varpi^2/4h^2t}, \tag{7}$$

and also

$$q\varpi\,\mathrm{Kh}_1(q\varpi)H(t) = -\varpi\frac{\partial}{\partial\varpi}\mathrm{Kh}_0(q\varpi)H(t) = \frac{1}{\pi}\,\frac{e^{-\varpi^2/4h^2t}}{\varpi^2/4h^2t}\,\frac{2\varpi^2}{4h^2t} = \frac{2}{\pi}\,e^{-\varpi^2/4h^2t}. \tag{8}$$

These operators, which occur in problems of diffusion about circular cylinders, are
special cases of some that lead to the confluent hypergeometric function.

21·06. Functions of large order: approximations of Green's type. Asymptotic
approximations for large n have been found (17·132), apart from constant factors, by
direct study of the differential equations. The constant factors can be identified by comparison
with the Stokes expansions for x much larger than n. The approximations can be,
and originally were, found by Debye by the method of steepest descents. We illustrate this
by means of $I_n(x)$ and $\mathrm{Kh}_n(x)$ for x real and positive. We have from 21·022 (57),

$$\mathrm{Kh}_n(x) = \frac{2}{\pi}\int_0^\infty \exp(-x\cosh u)\cosh nu\,du \approx \frac{1}{\pi}\int_0^\infty \exp(-x\cosh u + nu)\,du = \frac{1}{\pi}\int_0^\infty \exp\{f(u)\}\,du \tag{1}$$

since the term in $\exp(-nu)$ decreases steadily with increasing u and is easily shown to
be negligible in comparison with that in $\exp(nu)$ when n is large. The path of steepest
descent is the real axis. The integrand is a maximum when

$$f'(u) = n - x\sinh u = 0;$$

and then

$$\sinh u = n/x, \qquad f''(u) = -x\cosh u, \qquad f(u) = n\sinh^{-1}\frac{n}{x} - (x^2+n^2)^{1/2}, \tag{2}$$

$$\mathrm{Kh}_n(x) \sim \left(\frac{2}{\pi}\right)^{1/2} x^{-n}(n^2+x^2)^{-1/4}\{n + (n^2+x^2)^{1/2}\}^{n}\exp\{-(n^2+x^2)^{1/2}\}. \tag{3}$$


For $I_n(x)$ we can use 21·011 (20),

$$I_n(x) = \frac{1}{2\pi i}\int \exp\{f(\mu)\}\,\frac{d\mu}{(\mu^2-1)^{1/2}}, \tag{4}$$

where

$$f(\mu) = \mu x - n\log\{\mu + (\mu^2-1)^{1/2}\}. \tag{5}$$

There is a saddle-point where

$$f'(\mu) = x - \frac{n}{(\mu^2-1)^{1/2}} = 0; \tag{6}$$

and then

$$\mu = \left(1 + \frac{n^2}{x^2}\right)^{1/2}, \qquad f''(\mu) = \frac{n\mu}{(\mu^2-1)^{3/2}} = \frac{x^2}{n^2}(x^2+n^2)^{1/2},$$

$$f(\mu) = (x^2+n^2)^{1/2} - n\log\left\{\left(1 + \frac{n^2}{x^2}\right)^{1/2} + \frac{n}{x}\right\}. \tag{7}$$

Since $f''(\mu)$ is positive the path of steepest descent is parallel to the imaginary axis, and
we find

$$I_n(x) \sim \frac{1}{\sqrt{(2\pi)}}\,x^{n}(n^2+x^2)^{-1/4}\{n + (n^2+x^2)^{1/2}\}^{-n}\exp\,(n^2+x^2)^{1/2}. \tag{8}$$

The corresponding approximations to $J_n(x)$ and $Y_n(x)$ are, for $x < n$,

$$J_n(x) \sim \frac{1}{\sqrt{(2\pi)}}\,x^{n}(n^2-x^2)^{-1/4}\{n + (n^2-x^2)^{1/2}\}^{-n}\exp\,(n^2-x^2)^{1/2}, \tag{9}$$

$$Y_n(x) \sim -\left(\frac{2}{\pi}\right)^{1/2} x^{-n}(n^2-x^2)^{-1/4}\{n + (n^2-x^2)^{1/2}\}^{n}\exp\{-(n^2-x^2)^{1/2}\}, \tag{10}$$

and for $x > n$

$$J_n(x) \sim \left(\frac{2}{\pi}\right)^{1/2}(x^2-n^2)^{-1/4}\sin\{n(\tan v - v) + \tfrac14\pi\}, \tag{11}$$

$$Y_n(x) \sim -\left(\frac{2}{\pi}\right)^{1/2}(x^2-n^2)^{-1/4}\cos\{n(\tan v - v) + \tfrac14\pi\}, \tag{12}$$

where $\sec v = x/n$.

Later terms in approximations of this type have been obtained and are given by
Bickley* to order $n^{-11}$, but the recurrence relation is more complicated than for the
Stokes expansions. The first term for $x > n$ actually gives quite a good approximation
down to the first zero of $J_n(x)$.
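To see how good the first terms are, one can compare them with $J_n(x)$ computed directly. The sketch below is not from the text: it assumes the leading approximations are $J_n(x) \sim (2\pi)^{-1/2}x^n(n^2-x^2)^{-1/4}\{n+(n^2-x^2)^{1/2}\}^{-n}\exp(n^2-x^2)^{1/2}$ for $x<n$ and $J_n(x) \sim (2/\pi)^{1/2}(x^2-n^2)^{-1/4}\sin\{n(\tan v - v)+\tfrac14\pi\}$, $\sec v = x/n$, for $x>n$, and it evaluates $J_n$ for integer n from Bessel's integral with an equally spaced sum. The function names are hypothetical.

```python
import math

def bessel_j(n, x, m=512):
    """J_n(x), integer n, as the mean of cos(x sin t - n t) over one period."""
    return sum(math.cos(x * math.sin(2 * math.pi * k / m) - n * 2 * math.pi * k / m)
               for k in range(m)) / m

def debye_j_oscillatory(n, x):
    """Leading term for x > n, with sec v = x/n."""
    v = math.acos(n / x)
    amplitude = math.sqrt(2 / math.pi) * (x * x - n * n) ** -0.25
    return amplitude * math.sin(n * (math.tan(v) - v) + math.pi / 4)

def debye_j_monotonic(n, x):
    """Leading term for x < n."""
    s = math.sqrt(n * n - x * x)
    return x ** n * (n + s) ** (-n) * math.exp(s) / math.sqrt(2 * math.pi * s)
```

For n = 10 the two leading terms already agree with `bessel_j` to within a few per cent at x = 5 and x = 20; the agreement degrades, as expected, when x approaches n, where both forms break down.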


21·07. Applications of the Wronskian. Write Bessel's equation in the form

$$\frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left(1 - \frac{n^2}{x^2}\right)y = 0. \tag{1}$$

Then if the Wronskian of any two solutions $y_1$, $y_2$ is taken

$$W(y_1, y_2) = A\exp\left(-\int\frac{dx}{x}\right) = \frac{A}{x}. \tag{2}$$

In particular

$$J_n'(x)J_{-n}(x) - J_{-n}'(x)J_n(x) = \frac{A}{x}. \tag{3}$$

The constant can be fixed by considering the first terms.

$$J_n(x) = \frac{(\tfrac12x)^{n}}{n!} + \dots, \quad J_{-n}(x) = \frac{(\tfrac12x)^{-n}}{(-n)!} + \dots, \quad J_n'(x) = \frac{\tfrac12(\tfrac12x)^{n-1}}{(n-1)!} + \dots, \quad J_{-n}'(x) = \frac{\tfrac12(\tfrac12x)^{-n-1}}{(-n-1)!} + \dots,$$

and

$$A = \frac{1}{(n-1)!\,(-n)!} - \frac{1}{(-n-1)!\,n!} = \frac{2n}{n!\,(-n)!} = \frac{2\sin n\pi}{\pi}. \tag{4}$$

It follows at once that for n not an integer $J_n(x)$ and $J_{-n}(x)$ are two independent solutions,
but become proportional when n is an integer. Also

$$J_n(x)Y_n'(x) - J_n'(x)Y_n(x) = \frac{2}{\pi x}, \tag{5}$$

$$I_n'(x)\,\mathrm{Kh}_n(x) - I_n(x)\,\mathrm{Kh}_n'(x) = \frac{2}{\pi x}. \tag{6}$$

The factors in (5), (6) also follow easily by considering the asymptotic approximations for x
large (21·051). (The existence of asymptotic expansions of the derivatives follows at once
from the recurrence relations.)
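The value of the constant in the Wronskian of $J_n$ and $J_{-n}$ can be confirmed numerically. This is a minimal sketch, assuming the relation reads $J_n'(x)J_{-n}(x) - J_{-n}'(x)J_n(x) = 2\sin n\pi/\pi x$; it uses the ascending series with the standard-library gamma function, and the derivative is taken from the recurrence $J_\nu' = \tfrac12(J_{\nu-1} - J_{\nu+1})$. The helper names are illustrative only.

```python
import math

def bessel_j(nu, x, terms=40):
    """Ascending series for J_nu(x), x > 0, for non-integer order nu."""
    return sum((-1) ** k * (x / 2) ** (2 * k + nu)
               / (math.factorial(k) * math.gamma(k + nu + 1))
               for k in range(terms))

def bessel_j_prime(nu, x):
    """J_nu'(x) from the recurrence J_nu' = (J_{nu-1} - J_{nu+1}) / 2."""
    return 0.5 * (bessel_j(nu - 1, x) - bessel_j(nu + 1, x))

def wronskian_lhs(n, x):
    """J_n'(x) J_{-n}(x) - J_{-n}'(x) J_n(x)."""
    return bessel_j_prime(n, x) * bessel_j(-n, x) - bessel_j_prime(-n, x) * bessel_j(n, x)

def wronskian_rhs(n, x):
    """2 sin(n pi) / (pi x)."""
    return 2 * math.sin(n * math.pi) / (math.pi * x)
```

As n approaches an integer the right side tends to zero, which is the numerical face of the statement that $J_n$ and $J_{-n}$ then become proportional.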


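* Phil. Mag. (7) 34, 1943, 37-49.

21·08. Functions of order half an odd integer. We have seen from the asymptotic
expansions (17·121) that these can be expressed in finite terms. This can be done compactly
by means of the recurrence relations. We take in each case the relation that contains the
function of order n + 1, that of order n, and the derivative of the latter. Thus multiplying
the relation for $J_{n+1}(x)$ (21·04 (2)) by $x^{-n}$

$$x^{-n}J_{n+1}(x) = -x^{-n}J_n'(x) + nx^{-n-1}J_n(x) = -\frac{d}{dx}\{x^{-n}J_n(x)\}, \tag{1}$$

$$x^{-n-1}J_{n+1}(x) = -\frac{d}{x\,dx}\{x^{-n}J_n(x)\}, \tag{2}$$

and by induction

$$x^{-n-m}J_{n+m}(x) = (-1)^m\left(\frac{d}{x\,dx}\right)^{m}\{x^{-n}J_n(x)\}. \tag{3}$$

But

$$J_{1/2}(x) = \left(\frac{2}{\pi x}\right)^{1/2}\sin x, \tag{4}$$

and therefore

$$J_{m+1/2}(x) = \left(\frac{2}{\pi}\right)^{1/2}(-1)^m x^{m+1/2}\left(\frac{d}{x\,dx}\right)^{m}\left(\frac{\sin x}{x}\right). \tag{5}$$

Since we know the exponential factors for the Hankel functions, we infer at once

$$\mathrm{Hs}_{m+1/2}(x) = \left(\frac{2}{\pi}\right)^{1/2}(-1)^m x^{m+1/2}\left(\frac{d}{x\,dx}\right)^{m}\left(-\frac{ie^{ix}}{x}\right), \tag{6}$$

$$\mathrm{Hi}_{m+1/2}(x) = \left(\frac{2}{\pi}\right)^{1/2}(-1)^m x^{m+1/2}\left(\frac{d}{x\,dx}\right)^{m}\left(\frac{ie^{-ix}}{x}\right), \tag{7}$$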


( 8 ) 


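Since $I_{1/2}(x)$ consists of the same terms as $J_{1/2}(x)$ with all signs taken positive we have
immediately

$$I_{1/2}(x) = \left(\frac{2}{\pi x}\right)^{1/2}\sinh x, \tag{9}$$

and from the recurrence relation (21·04 (7))

$$x^{-n-1}I_{n+1}(x) = \frac{d}{x\,dx}\{x^{-n}I_n(x)\}, \tag{10}$$

whence

$$I_{m+1/2}(x) = \left(\frac{2}{\pi}\right)^{1/2} x^{m+1/2}\left(\frac{d}{x\,dx}\right)^{m}\left(\frac{\sinh x}{x}\right). \tag{11}$$

Also

$$\mathrm{Kh}_{1/2}(x) = \left(\frac{2}{\pi x}\right)^{1/2}e^{-x}, \tag{12}$$

$$\mathrm{Kh}_{m+1/2}(x) = \left(\frac{2}{\pi}\right)^{1/2}(-1)^m x^{m+1/2}\left(\frac{d}{x\,dx}\right)^{m}\left(\frac{e^{-x}}{x}\right). \tag{13}$$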
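As a quick illustration of how the operator $d/(x\,dx)$ works in practice, the case m = 1 can be carried through by hand. The lines below are only a worked check, assuming the general forms $J_{m+1/2}(x) = (2/\pi)^{1/2}(-1)^m x^{m+1/2}(d/x\,dx)^m(\sin x/x)$ and $\mathrm{Kh}_{m+1/2}(x) = (2/\pi)^{1/2}(-1)^m x^{m+1/2}(d/x\,dx)^m(e^{-x}/x)$ of this section.

```latex
\frac{d}{x\,dx}\left(\frac{\sin x}{x}\right) = \frac{x\cos x - \sin x}{x^{3}},
\qquad
J_{3/2}(x) = -\left(\frac{2}{\pi}\right)^{1/2} x^{3/2}\,\frac{x\cos x - \sin x}{x^{3}}
           = \left(\frac{2}{\pi x}\right)^{1/2}\left(\frac{\sin x}{x} - \cos x\right),
\\[6pt]
\frac{d}{x\,dx}\left(\frac{e^{-x}}{x}\right) = -\frac{(x+1)\,e^{-x}}{x^{3}},
\qquad
\mathrm{Kh}_{3/2}(x) = \left(\frac{2}{\pi x}\right)^{1/2} e^{-x}\left(1 + \frac{1}{x}\right).
```

Each further application of the operator adds one more inverse power of x inside the bracket, which is why the functions of order half an odd integer terminate.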

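21·09. The functions ber, bei, kher, khei.* These are defined by

$$\mathrm{ber}_n(x) \pm i\,\mathrm{bei}_n(x) = J_n(xe^{\pm 3\pi i/4}) = e^{\pm n\pi i/2}\,I_n(xe^{\pm\pi i/4}),$$

$$\mathrm{kher}_n(x) \pm i\,\mathrm{khei}_n(x) = e^{\mp n\pi i/2}\,\mathrm{Kh}_n(xe^{\pm\pi i/4}), \quad = i\,\mathrm{Hs}_n(xe^{3\pi i/4}) \ \text{(upper sign)}, \quad = -i\,\mathrm{Hi}_n(xe^{-3\pi i/4}) \ \text{(lower sign)}.$$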


* A. Russell, Phil. Mag. (6), 17, 1909, 524-552; C. S. Whitehead, Quart. Journ. Math., 42, 1911,
316-42; H. G. Savidge, Phil. Mag. (6), 19, 1910, 49-58.

They occur especially in problems of periodic heat flow and slow periodic motion of
a viscous fluid with cylindrical boundaries. $\mathrm{kher}_n(x)$ and $\mathrm{khei}_n(x)$ are $2/\pi$ times the
tabulated functions $\mathrm{ker}_n(x)$, $\mathrm{kei}_n(x)$. The properties of the functions are easily inferred
from those of $I_n(x)$ and $\mathrm{Kh}_n(x)$ for complex argument.

If $|x|$ is small,

$$\mathrm{Kh}_0(x) \approx -\frac{2}{\pi}\log(\tfrac12x),$$

whence

$$\mathrm{kher}_0(x) \approx -\frac{2}{\pi}\log|\tfrac12x|, \qquad \mathrm{khei}_0(x) = -\tfrac12 \ (x > 0), \quad = \tfrac12 \ (x < 0).$$

If $p^{-1}f(t)$ denotes $\int_0^t f(t)\,dt$,

$$p^{-n}e^{i/p}H(t) = t^{n/2}e^{-3n\pi i/4}\{\mathrm{ber}_n\,2\sqrt{t} + i\,\mathrm{bei}_n\,2\sqrt{t}\}\,H(t).$$

21·10. Expansions and definite integrals. It follows immediately from Schläfli's
integral and Laurent's theorem that if n is an integer, positive or negative, $J_n(x)$ is the
coefficient of $\lambda^n$ in the expansion of $\exp\{\tfrac12x(\lambda - 1/\lambda)\}$ in positive and negative powers of $\lambda$.
This can also be shown directly by multiplication of series. For this reason the Bessel
functions of integral order are often called Bessel coefficients. We therefore have

$$\exp\left\{\tfrac12x\left(\lambda - \frac{1}{\lambda}\right)\right\} = \sum_{n=-\infty}^{\infty} J_n(x)\,\lambda^n \tag{1}$$

without restriction on x and $\lambda$. Put $\lambda = \exp i\theta$. Then

$$\exp(ix\sin\theta) = \sum_{n=-\infty}^{\infty} J_n(x)\,e^{ni\theta}. \tag{2}$$

Multiply by $\exp(-in\theta)$ and integrate from $-\pi$ to $\pi$; then

$$\int_{-\pi}^{\pi}\exp i(x\sin\theta - n\theta)\,d\theta = 2\pi J_n(x), \tag{3}$$

and

$$J_n(x) = \frac{1}{\pi}\int_0^{\pi}\cos(x\sin\theta - n\theta)\,d\theta, \tag{4}$$

a particular case of the result for general n.
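The name "Bessel coefficients" can be made concrete in a few lines. The sketch below is illustrative only: it assumes the integral form $J_n(x) = (1/\pi)\int_0^\pi\cos(x\sin\theta - n\theta)\,d\theta$ for integer n, evaluates it as an equally spaced mean over a full period (which is very accurate for periodic integrands), and then rebuilds $\exp(ix\sin\theta)$ from the truncated sum $\sum J_n(x)e^{ni\theta}$.

```python
import cmath
import math

def bessel_j(n, x, m=256):
    """Integer-order J_n(x): mean of cos(x sin t - n t) over one period."""
    return sum(math.cos(x * math.sin(2 * math.pi * k / m) - n * 2 * math.pi * k / m)
               for k in range(m)) / m

def generating_sum(x, theta, n_max=20):
    """Truncated sum of J_n(x) exp(i n theta) over |n| <= n_max."""
    return sum(bessel_j(n, x) * cmath.exp(1j * n * theta)
               for n in range(-n_max, n_max + 1))
```

For moderate x the terms fall off very rapidly once $|n|$ exceeds x, so a short truncation reproduces `cmath.exp(1j * x * math.sin(theta))` essentially to rounding error.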

If n = 0 we can replace $\theta$ by $\theta - \tfrac12\pi$ in (4), and

$$J_0(x) = \frac{1}{\pi}\int_0^{\pi}\cos(x\cos\theta)\,d\theta = \frac{1}{\pi}\int_0^{\pi}\exp(ix\cos\theta)\,d\theta. \tag{5}$$

Now consider

$$I = \int_0^\infty e^{-ax}J_0(bx)\,dx \quad (a > 0;\ a, b \text{ real}) \tag{6}$$

$$= \frac{1}{2\pi}\int_0^\infty e^{-ax}\,dx\int_{-\pi}^{\pi}\exp(ibx\sin\theta)\,d\theta.$$

Integrate first with regard to x; then

$$I = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{d\theta}{a - ib\sin\theta} = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{e^{i\theta}\,d\theta}{ae^{i\theta} - \tfrac12be^{2i\theta} + \tfrac12b} = -\frac{1}{\pi i}\int\frac{dz}{bz^2 - 2az - b} \tag{7}$$

taken round the unit circle. The poles are at

$$bz = a \pm \sqrt{(a^2+b^2)}.$$

For real a, b these are real, and one is inside and the other outside the circle. The integral
is then found to be $(a^2+b^2)^{-1/2}$. With a change of notation

$$\int_0^\infty e^{-\kappa z}J_0(\kappa\varpi)\,d\kappa = (\varpi^2+z^2)^{-1/2} \tag{8}$$

with $z > 0$, z, $\varpi$ real. We can regard $\varpi$, z as cylindrical coordinates; then the right side is
simply $r^{-1}$, and we have expressed the fundamental solution of Laplace's equation in
terms of solutions of the equation in cylindrical coordinates. This is valid for $z > 0$; for
$z < 0$ we evidently must take the exponent as $+\kappa z$.
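The middle step of this evaluation, in which the x integration has been done and only the angular integral remains, is convenient to test numerically. A minimal sketch, assuming that step reads $I = (1/2\pi)\int_{-\pi}^{\pi} d\theta/(a - ib\sin\theta)$ and that the final value is $(a^2+b^2)^{-1/2}$; the function name is hypothetical.

```python
import math

def laplace_j0(a, b, m=2000):
    """(1/2pi) * integral over one period of d(theta) / (a - i b sin(theta)).

    Evaluated as an equally spaced mean, which converges very fast here
    because the integrand is periodic and analytic for a > 0.
    """
    total = sum(1.0 / complex(a, -b * math.sin(2 * math.pi * k / m)) for k in range(m))
    return total / m
```

`laplace_j0(1.0, 2.0)` should agree with `1 / math.sqrt(5)` to rounding error, with a negligible imaginary part; with a read as z and b as the cylindrical radius this is the statement that the integral represents 1/r.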

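Now we should expect that a solution of Laplace's equation in cylindrical coordinates
$(\varpi, \lambda, z)$ can be expressed in another way. For subject to convergence conditions, if we
keep $\varpi$, $\lambda$ constant, the solution $\phi$ can be expressed in terms of $\cos\alpha z$ and $\sin\alpha z$ as a Fourier
integral $\int f(\alpha, \varpi, \lambda)\cos\alpha z\,d\alpha + \int g(\alpha, \varpi, \lambda)\sin\alpha z\,d\alpha$, and the same will apply to $\nabla^2\phi$. But
if $\nabla^2\phi = 0$ for all z, f and g must satisfy

$$\frac{\partial^2 f}{\partial\varpi^2} + \frac{1}{\varpi}\frac{\partial f}{\partial\varpi} + \frac{1}{\varpi^2}\frac{\partial^2 f}{\partial\lambda^2} - \alpha^2 f = 0$$

and must therefore be of the form $\{A\,I_n(\alpha\varpi) + B\,\mathrm{Kh}_n(\alpha\varpi)\}(\cos n\lambda, \sin n\lambda)$. Such an
expression has obvious drawbacks, since $I_n(\alpha\varpi)$ tends to $\infty$ as $\varpi \to \infty$, and $\mathrm{Kh}_n(\alpha\varpi)$ to
infinity as $\varpi \to 0$. But for a distribution with a singularity at $\varpi = 0$ and tending to 0 at
infinity we may expect the $\mathrm{Kh}_n$ solution to be admissible. It can be obtained as follows.
We may think of the problem as a potential one, with a line density proportional to $\cos\alpha z$
along the axis of z. Then the potential on z = 0 due to such a distribution will be

$$\int_{-\infty}^{\infty}\frac{\cos\alpha\zeta\,d\zeta}{(\varpi^2+\zeta^2)^{1/2}} = 2\,\Re\int_0^\infty\frac{e^{i\alpha\zeta}\,d\zeta}{(\varpi^2+\zeta^2)^{1/2}}, \tag{9}$$

with an indentation in the path about $\zeta = i\varpi$. Put $\zeta = i\kappa$ and then $\kappa = \varpi\cosh v$. Then
we have, the integral up to $\kappa = \varpi$ being purely imaginary,

$$2\,\Re\, i\int_0^\infty\frac{e^{-\alpha\kappa}\,d\kappa}{(\varpi^2-\kappa^2)^{1/2}} = 2\int_{\varpi}^\infty\frac{e^{-\alpha\kappa}\,d\kappa}{(\kappa^2-\varpi^2)^{1/2}} = 2\int_0^\infty e^{-\alpha\varpi\cosh v}\,dv = \pi\,\mathrm{Kh}_0(\alpha\varpi) \tag{10}$$

from 21·022 (59).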

Alternatively, the potential close to the axis must be $-2\log\varpi$ times the line density.
But for $\varpi$ small $\mathrm{Kh}_0(\alpha\varpi)$ behaves like $-\dfrac{2}{\pi}\log(\tfrac12\alpha\varpi)$, and therefore $\pi\,\mathrm{Kh}_0(\alpha\varpi)\cos\alpha z$ will
behave like $-2\log\varpi\cos\alpha z$. Then the potential due to a line density $\cos\alpha z$ will be
$\pi\,\mathrm{Kh}_0(\alpha\varpi)\cos\alpha z$.

Now take the density to be $1/2h$ from $z = -h$ to $z = h$ and otherwise zero. Then we
can express it as a Fourier integral $\int_0^\infty f(\alpha)\cos\alpha z\,d\alpha$, with

$$f(\alpha) = \frac{2}{\pi}\int_0^{h}\frac{1}{2h}\cos\alpha z\,dz = \frac{\sin\alpha h}{\pi\alpha h}. \tag{11}$$

This is therefore the factor to be associated with $\cos\alpha z$ in the expression of a line density
uniform for $-h < z < h$, and the limit function is $1/\pi$ when h is small and the distribution
reduces to a unit mass or charge. Then we have

$$\frac{1}{r} = \int_0^\infty \mathrm{Kh}_0(\alpha\varpi)\cos\alpha z\,d\alpha \quad (\varpi > 0). \tag{12}$$

The integral converges at both limits except when $\varpi = 0$.

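It is instructive to see how the apparently quite different expressions (8) and (12) can
be connected directly. Starting with (8) we write it as

$$\frac{1}{r} = \tfrac12\int_0^\infty e^{-\kappa z}\{\mathrm{Hs}_0(\kappa\varpi) + \mathrm{Hi}_0(\kappa\varpi)\}\,d\kappa. \tag{13}$$

The two parts must be treated separately since $\mathrm{Hs}_0(x) \to 0$ at $x = +i\infty$, $\mathrm{Hi}_0(x) \to 0$ at
$x = -i\infty$. Then

$$\frac{1}{r} = \tfrac12\int_0^{i\infty} e^{-\kappa z}\,\mathrm{Hs}_0(\kappa\varpi)\,d\kappa + \tfrac12\int_0^{-i\infty} e^{-\kappa z}\,\mathrm{Hi}_0(\kappa\varpi)\,d\kappa$$

$$= \tfrac12\int_0^\infty e^{-i\alpha z}\,i\,\mathrm{Hs}_0(i\alpha\varpi)\,d\alpha - \tfrac12\int_0^\infty e^{i\alpha z}\,i\,\mathrm{Hi}_0(-i\alpha\varpi)\,d\alpha$$

$$= \tfrac12\int_0^\infty e^{-i\alpha z}\,\mathrm{Kh}_0(\alpha\varpi)\,d\alpha + \tfrac12\int_0^\infty e^{i\alpha z}\,\mathrm{Kh}_0(\alpha\varpi)\,d\alpha$$

$$= \int_0^\infty \mathrm{Kh}_0(\alpha\varpi)\cos\alpha z\,d\alpha, \tag{14}$$

by using the relations 21·022 (67) between $\mathrm{Kh}_0$ and the Hankel functions. This type of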
transformation is frequently used in the treatment of waves over plane boundaries, and it is 
well to have it in its simplest possible application. A modification for spherical boundaries 
is the basis of much work on the propagation of electromagnetic waves over a sphere. 


21·101. Fourier-Bessel integral. Subject to conditions similar to those for
Fourier's integral theorem a function of position over a plane can be expressed in
terms of Bessel functions. If $\phi(\rho, \chi)$ is the potential at Q, whose cylindrical coordinates
are $(\rho, \chi, 0)$, and P is $(\varpi, \lambda, z)$, where $z > 0$,

$$\phi_P = -\frac{1}{2\pi}\frac{\partial}{\partial z}\iint\!\!\int_0^\infty \phi(\rho, \chi)\,e^{-\kappa z}J_0(\kappa q)\,dS\,d\kappa, \tag{1}$$

where q is the projection of QP on the plane z = 0, and therefore

$$q^2 = \rho^2 + \varpi^2 - 2\rho\varpi\cos(\chi - \lambda). \tag{2}$$

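We wish to have an expression in terms of $e^{-\kappa z}J_n(\kappa\varpi)(\cos n\lambda, \sin n\lambda)$, the typical solutions
of Laplace's equation in cylindrical coordinates. The reduction of $J_0(\kappa q)$ to this form is
unexpectedly difficult and was apparently discovered by Neumann and Heine as a
limiting case of the corresponding result in spherical polar coordinates. The expansion
required is

$$J_0(\kappa q) = J_0(\kappa\varpi)J_0(\kappa\rho) + 2\sum_{n=1}^{\infty} J_n(\kappa\varpi)J_n(\kappa\rho)\cos n(\chi - \lambda). \tag{3}$$

To verify it, we substitute Schläfli integrals for the Bessel functions and choose the paths
so that the variables of integration have moduli > 1 at all points of the paths. Then

$$J_n(\kappa\varpi)J_n(\kappa\rho)\cos n(\chi - \lambda) = -\frac{1}{4\pi^2}\int\frac{e^{\frac12\kappa\varpi(\alpha - 1/\alpha)}}{\alpha^{n+1}}\,d\alpha\int\frac{e^{\frac12\kappa\rho(\beta - 1/\beta)}}{\beta^{n+1}}\,d\beta\,\cos n(\chi - \lambda) \tag{4}$$

and

$$1 + 2\sum_{n=1}^{\infty}\frac{\cos n(\chi - \lambda)}{\alpha^n\beta^n} = \frac{\alpha^2\beta^2 - 1}{\alpha^2\beta^2 - 2\alpha\beta\cos(\chi - \lambda) + 1}. \tag{5}$$

Then the series is

$$S = -\frac{1}{4\pi^2}\iint\exp\left\{\tfrac12\kappa\varpi\left(\alpha - \frac{1}{\alpha}\right) + \tfrac12\kappa\rho\left(\beta - \frac{1}{\beta}\right)\right\}\frac{\alpha^2\beta^2 - 1}{\alpha^2\beta^2 - 2\alpha\beta\cos(\chi - \lambda) + 1}\,\frac{d\alpha\,d\beta}{\alpha\beta}, \tag{6}$$

and since the integrand is single-valued we can replace the paths by any closed contours
such that $|\alpha| > 1$, $|\beta| > 1$. Write

$$\chi - \lambda = \vartheta, \qquad \alpha = \sigma/\beta, \tag{7}$$

where $|\sigma| = |\alpha\beta| > 1$. Then the index of the exponential is

$$\tfrac12\kappa\left\{\left(\rho - \frac{\varpi}{\sigma}\right)\beta - (\rho - \varpi\sigma)\frac{1}{\beta}\right\} = \zeta, \text{ say}, \tag{8}$$

and

$$S = -\frac{1}{4\pi^2}\iint(\exp\zeta)\,\frac{\sigma^2 - 1}{\sigma^2 - 2\sigma\cos\vartheta + 1}\,\frac{d\sigma\,d\beta}{\sigma\beta}. \tag{9}$$

We can now integrate with regard to $\beta$ and get

$$S = \frac{1}{2\pi i}\int J_0\!\left[\kappa\left\{\rho^2 - \rho\varpi\left(\sigma + \frac{1}{\sigma}\right) + \varpi^2\right\}^{1/2}\right]\frac{\sigma - 1/\sigma}{\sigma - 2\cos\vartheta + 1/\sigma}\,\frac{d\sigma}{\sigma}. \tag{10}$$

The path can be taken to be any circle of radius > 1 so as to enclose the poles at $\sigma = \exp(\pm i\vartheta)$.
Now if we put $\sigma = 1/\sigma'$, and study the changes in sign, the form is unaltered, but the
new path is a circle c of radius less than 1 traversed in the negative direction. Thus we
have also, taking c now in the positive sense,

$$S = -\frac{1}{2\pi i}\int_c J_0\!\left[\kappa\left\{\rho^2 - \rho\varpi\left(\sigma + \frac{1}{\sigma}\right) + \varpi^2\right\}^{1/2}\right]\frac{\sigma - 1/\sigma}{\sigma - 2\cos\vartheta + 1/\sigma}\,\frac{d\sigma}{\sigma}, \tag{11}$$

and by addition 2S is simply the sum of the residues at the two poles. This is evaluated
immediately and gives

$$S = J_0\{\kappa(\rho^2 - 2\varpi\rho\cos\vartheta + \varpi^2)^{1/2}\} = J_0(\kappa q) \tag{12}$$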
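The expansion just verified (Neumann's addition theorem for $J_0$) lends itself to a direct numerical check. This is a minimal sketch, assuming the theorem in the form $J_0(\kappa q) = J_0(\kappa\varpi)J_0(\kappa\rho) + 2\sum_{n\ge1} J_n(\kappa\varpi)J_n(\kappa\rho)\cos n\vartheta$ with $q^2 = \rho^2 + \varpi^2 - 2\rho\varpi\cos\vartheta$; the helper names and the truncation order are choices made here, not taken from the text.

```python
import math

def bessel_j(n, x, m=256):
    """Integer-order J_n(x): mean of cos(x sin t - n t) over one period."""
    return sum(math.cos(x * math.sin(2 * math.pi * k / m) - n * 2 * math.pi * k / m)
               for k in range(m)) / m

def addition_series(kw, kp, angle, n_max=30):
    """J_0(kw) J_0(kp) + 2 * sum_{n>=1} J_n(kw) J_n(kp) cos(n * angle), truncated."""
    total = bessel_j(0, kw) * bessel_j(0, kp)
    for n in range(1, n_max + 1):
        total += 2 * bessel_j(n, kw) * bessel_j(n, kp) * math.cos(n * angle)
    return total

def j0_of_distance(kw, kp, angle):
    """J_0(kappa q), with (kappa q)^2 = kp^2 + kw^2 - 2 kp kw cos(angle)."""
    q = math.sqrt(kp * kp + kw * kw - 2 * kp * kw * math.cos(angle))
    return bessel_j(0, q)
```

Here `kw` and `kp` stand for $\kappa\varpi$ and $\kappa\rho$. For arguments of order unity a few terms of the series already reproduce `j0_of_distance` to many figures.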

as was to be shown. Hence

$$\phi_P = -\frac{1}{2\pi}\frac{\partial}{\partial z}\iiint\phi(\rho, \chi)\,e^{-\kappa z}\left\{J_0(\kappa\varpi)J_0(\kappa\rho) + 2\sum_{n=1}^{\infty}J_n(\kappa\varpi)J_n(\kappa\rho)\cos n(\chi - \lambda)\right\}\rho\,d\rho\,d\chi\,d\kappa. \tag{13}$$

The differentiation with regard to z can be done under the integral sign, and we have the
expansion required. The reversal of the order of summation and integration is permissible
in the same sort of conditions as for Fourier's integral theorem (applied, of course, in two
dimensions). The proof of convergence for the limiting integral over the surface obtained
by putting z = 0 at the start requires additional conditions, as for Fourier's theorems;
if it does converge, it is equal to $\phi(\varpi, \lambda, 0)$. What is called the Fourier-Bessel expansion
theorem is then

$$\phi(\varpi, \lambda) = \frac{1}{2\pi}\int_0^\infty\!\!\int_0^{2\pi}\!\!\int_0^\infty\phi(\rho, \chi)\,J_0(\kappa\varpi)J_0(\kappa\rho)\,\kappa\rho\,d\rho\,d\chi\,d\kappa$$

$$\qquad + \sum_{n=1}^{\infty}\frac{1}{\pi}\int_0^\infty\!\!\int_0^{2\pi}\!\!\int_0^\infty\phi(\rho, \chi)\,J_n(\kappa\varpi)J_n(\kappa\rho)\cos n(\chi - \lambda)\,\kappa\rho\,d\rho\,d\chi\,d\kappa. \tag{14}$$

It should be noticed that when $\rho$ is large, $J_n(\kappa\rho)$ is of order $\rho^{-1/2}$ and the absolute convergence
of the integrals requires that $\phi(\rho, \chi) \to 0$ faster than $\rho^{-3/2}$.


21·102. Expansion between concentric circles. If u and v are functions of x
satisfying

$$x^2u'' + xu' + (\lambda^2x^2 - l^2)u = 0, \tag{1}$$

$$x^2v'' + xv' + (\mu^2x^2 - m^2)v = 0, \tag{2}$$

we multiply by v/x, u/x respectively and subtract; then

$$x(u''v - uv'') + (u'v - uv') + \left\{(\lambda^2 - \mu^2)x - \frac{l^2 - m^2}{x}\right\}uv = 0, \tag{3}$$

and by integration

$$\int\left\{(\lambda^2 - \mu^2)x - \frac{l^2 - m^2}{x}\right\}uv\,dx = -[x(u'v - uv')]. \tag{4}$$

The boundary conditions are usually such that the terms on the right vanish there. If
then $\lambda = \mu$, $l \ne m$,

$$\int uv\,\frac{dx}{x} = 0, \tag{5}$$

and if $\lambda \ne \mu$, $l = m$,

$$\int xuv\,dx = 0, \tag{6}$$

the limits being any values of x where the terms on the right of (4) vanish. In particular
if the limits for x are 0 and a, and $\lambda$ and $\mu$ are two different quantities such that

$$J_m(\lambda a) = J_m(\mu a) = 0, \quad\text{or}\quad J_m'(\lambda a) = J_m'(\mu a) = 0,$$

then

$$\int_0^a xJ_m(\lambda x)J_m(\mu x)\,dx = 0. \tag{7}$$

This might have been expected from the general orthogonality relations inferred from
Green's theorem. To determine the coefficients in the expansion of a given function we
need also the integral of $xv^2$, where v is any solution of (2). Multiplying the equation by
$v'$ we have

$$0 = x^2v'v'' + xv'^2 + (\mu^2x^2 - m^2)vv' = \frac{d}{dx}\{\tfrac12x^2v'^2 + \tfrac12(\mu^2x^2 - m^2)v^2\} - \mu^2xv^2,$$

and therefore, between any limits,

$$\int xv^2\,dx = \frac{1}{2\mu^2}\,[x^2v'^2 + (\mu^2x^2 - m^2)v^2]. \tag{8}$$
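Both results of this section, the orthogonality relation and the normalizing integral, can be tried out for m = 0 on the unit interval. The following is a sketch under stated assumptions: the two zeros of $J_0$ are entered as tabulated constants, $J_n$ is computed from Bessel's integral, the normalizing formula is taken as $\int xv^2\,dx = (1/2\mu^2)[x^2v'^2 + (\mu^2x^2 - m^2)v^2]$, and $J_0' = -J_1$ is used for the derivative. The names are illustrative.

```python
import math

# First two positive zeros of J_0 (tabulated values, assumed here).
J0_ZEROS = (2.404825557695773, 5.520078110286311)

def bessel_j(n, x, m=128):
    """Integer-order J_n(x): mean of cos(x sin t - n t) over one period."""
    return sum(math.cos(x * math.sin(2 * math.pi * k / m) - n * 2 * math.pi * k / m)
               for k in range(m)) / m

def simpson(f, a, b, steps=400):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    s = f(a) + f(b)
    for j in range(1, steps):
        s += (4 if j % 2 else 2) * f(a + j * h)
    return s * h / 3

def overlap(lam, mu, a=1.0):
    """integral_0^a x J_0(lam x) J_0(mu x) dx."""
    return simpson(lambda x: x * bessel_j(0, lam * x) * bessel_j(0, mu * x), 0.0, a)

def norm_from_formula(mu, a=1.0):
    """Bracket of the normalizing formula between 0 and a, for v = J_0(mu x), m = 0."""
    v = bessel_j(0, mu * a)
    dv = -mu * bessel_j(1, mu * a)          # v' = -mu J_1(mu x)
    return (a * a * dv * dv + mu * mu * a * a * v * v) / (2 * mu * mu)
```

With `lam, mu = J0_ZEROS`, `overlap(lam, mu)` is zero to within the quadrature error, and `overlap(lam, lam)` agrees with `norm_from_formula(lam)`, which reduces to $\tfrac12 a^2 J_1(\lambda a)^2$ at a zero of $J_0$.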


EXAMPLES

1. If

$$x^2\frac{d^2y}{dx^2} + nx\frac{dy}{dx} + (b + cx^{2m})y = 0$$

and

$$\xi = x^m, \qquad y = x^{-\frac12(n-1)}\eta,$$

prove that

$$\xi^2\frac{d^2\eta}{d\xi^2} + \xi\frac{d\eta}{d\xi} + \left(\frac{c\,\xi^2}{m^2} - \mu^2\right)\eta = 0,$$

where

$$\mu^2m^2 = \tfrac14(n-1)^2 - b. \qquad\text{(Lommel.)}$$

2. If


prove that


and hence show that the Airy integral is a multiple of $x^{1/2}\,\mathrm{Kh}_{1/3}(\tfrac23x^{3/2})$. (Nicholson.)

3. Express Bi(x) in terms of Bessel functions of order $\pm\tfrac13$. (Miller.)






Chapter 22 

APPLICATIONS OF BESSEL FUNCTIONS

22·01. The majority of applications of Bessel functions are to vibrations of systems
with symmetry about an axis; the z coordinate usually either varies little, as in tidal waves 
on circular sheets of water, or the dependent variable is independent of z. Even if it 
involves z, Bessel functions usually provide the best treatment if the boundaries are 
planes of constant z. Bessel functions of order half an odd integer, in combination with 
Legendre functions, arise in problems of vibration for spherical boundaries. They also 
occur in various one-dimensional problems, notably the oscillations of a light string 
loaded with heavy particles at regular intervals, and the transmission of electric waves 
in a submarine cable. 


22·02. Cylindrical pulse. Consider the explosion problem of 19·08, with the
modification that the original excess pressure $P_0$ is within a cylinder of radius a instead of
a sphere. With analogous initial conditions the subsidiary equation is

$$\frac{1}{\varpi}\frac{\partial}{\partial\varpi}\left(\varpi\frac{\partial\phi}{\partial\varpi}\right) - \frac{p^2}{c^2}\phi = -\frac{P_0\,p}{\rho c^2} \quad (\varpi < a), \qquad = 0 \quad (\varpi > a). \tag{1}$$

The complementary functions are the Bessel functions of order zero, $I_0(p\varpi/c)$ and
$\mathrm{Kh}_0(p\varpi/c)$. The latter is inadmissible within the cylinder because it is infinite when $\varpi = 0$.
The former cannot occur outside it. For the interpretation is to be an integral through
values of the variable with positive real parts, and when $\varpi$ is great the asymptotic expansion
of $I_0(z\varpi/c)$ contains $\exp(z\varpi/c)$ as a factor. Hence the solution would give a pulse
travelling inwards. The solution is therefore


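$$\phi = \frac{P_0}{\rho p} + A\,I_0\!\left(\frac{p\varpi}{c}\right) \quad (\varpi < a), \tag{2}$$

$$\phi = B\,\mathrm{Kh}_0\!\left(\frac{p\varpi}{c}\right) \quad (\varpi > a). \tag{3}$$

Also $\partial\phi/\partial t$ and $\partial\phi/\partial\varpi$ must be continuous at $\varpi = a$. Hence

$$\frac{P_0}{\rho p} + A\,I_0\!\left(\frac{pa}{c}\right) = B\,\mathrm{Kh}_0\!\left(\frac{pa}{c}\right), \tag{4}$$

$$A\,I_0'\!\left(\frac{pa}{c}\right) = B\,\mathrm{Kh}_0'\!\left(\frac{pa}{c}\right). \tag{5}$$

We have the identity

$$I_0'(x)\,\mathrm{Kh}_0(x) - I_0(x)\,\mathrm{Kh}_0'(x) = 2/\pi x,$$

from 21·07 (6). Hence for $\varpi > a$

$$\frac{\rho\phi}{P_0} = \frac{\pi a}{2c}\,I_0'\!\left(\frac{pa}{c}\right)\mathrm{Kh}_0\!\left(\frac{p\varpi}{c}\right). \tag{6}$$

But

$$I_0(x) = \frac{1}{\pi}\int_0^{\pi}\exp(x\cos\theta)\,d\theta, \tag{7}$$

$$\mathrm{Kh}_0(x) = \frac{2}{\pi}\int_0^\infty\exp(-x\cosh v)\,dv, \tag{8}$$

and therefore

$$\frac{\rho\phi}{P_0} = \frac{a}{\pi c}\int_0^{\pi}\!\!\int_0^\infty\cos\theta\,\exp\left(\frac{pa}{c}\cos\theta - \frac{p\varpi}{c}\cosh v\right)d\theta\,dv$$

$$= \frac{a}{\pi c}\int_0^{\pi}\!\!\int_0^\infty\cos\theta\,H\!\left(t + \frac{a}{c}\cos\theta - \frac{\varpi}{c}\cosh v\right)d\theta\,dv. \tag{9}$$

Since

$$t + \frac{a}{c}\cos\theta - \frac{\varpi}{c}\cosh v \le t + \frac{a}{c} - \frac{\varpi}{c}, \tag{10}$$

$\phi$ vanishes at any place up to time $(\varpi - a)/c$; and if we integrate first with regard to v,
we can replace the upper limit by $\cosh^{-1}\{(ct + a\cos\theta)/\varpi\}$ and the unit function by 1,
provided $ct + a\cos\theta > \varpi$. This will be true at least for $\theta = 0$ if $ct > \varpi - a$. Hence

$$\frac{\rho\phi}{P_0} = \frac{a}{\pi c}\int_0\cos\theta\,\cosh^{-1}\!\left(\frac{ct + a\cos\theta}{\varpi}\right)d\theta \quad (ct > \varpi - a). \tag{11}$$

If $ct > \varpi + a$, $(ct + a\cos\theta)/\varpi > 1$ for all $\theta$, and the upper limit is $\pi$. If $\varpi - a < ct < \varpi + a$, the
upper limit is $\cos^{-1}\{(\varpi - ct)/a\}$. The disturbance can therefore be divided into three stages,
according as $ct < \varpi - a$, $\varpi - a < ct < \varpi + a$, and $\varpi + a < ct$. In the first stage $\phi = 0$ and we
have a cylindrical pulse travelling outwards with velocity c.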

We are interested chiefly in the pressure. This is given by

$$\frac{P}{P_0} = \frac{a}{\pi}\int_0\frac{\cos\theta\,d\theta}{\{(ct + a\cos\theta)^2 - \varpi^2\}^{1/2}}. \tag{12}$$

Put for $\varpi - a < ct < \varpi + a$,

$$ct + a - \varpi = 2b, \qquad ct + a\cos\theta - \varpi = 2b\cos^2\psi \tag{13}$$

and suppose b small. Then soon after the arrival of the pulse


P 

Po 




(14) 


The increase of pressure on arrival is therefore $\tfrac12P_0(a/\varpi)^{1/2}$, as against $\tfrac12P_0$ in the corresponding
one-dimensional problem and $\tfrac12P_0a/\varpi$ in the three-dimensional one. The decrease
with time is at first proportionately slower than in the three-dimensional problem, and
P is still positive when $ct = \varpi$. But it tends to $-\infty$ at $ct = \varpi + a$ and returns to finite
negative values for greater values of t. The approximate value near $ct = \varpi + a$ is*


(15)

and when $ct - \varpi$ is large compared with a is

$$\frac{P}{P_0} = -\frac12\,\frac{a^2ct}{(c^2t^2 - \varpi^2)^{3/2}}. \tag{16}$$

* Jeffreys, Proc. Camb. Phil. Soc. 39, 1943, 48-51.

This infinite disturbance of pressure does not imply infinite energy, since the infinity is
only logarithmic; and it will in any case be modified by the inclusion of second-order
terms in the hydrodynamical equations. The instantaneous release of the whole of the
surface of an infinite cylinder would be difficult to arrange physically, but an approximation
to it would be possible if the interior was filled with an explosive mixture and the
velocity of the wave of combustion in it was several times the velocity of sound in cool
air. The indefinitely prolonged tail of the disturbance is characteristic of two-dimensional
propagation. It occurs also for a point source between two parallel plates and in the formation
of elastic waves in a solid;* surface waves are formed by diffraction at the boundary,
spread out in two dimensions, and give at any place only an asymptotic return to the
original position, in spite of the fact that the original disturbance may be of finite extent
in all three directions.
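The pressure history described in this section can be traced numerically. The sketch below is not from the text: it assumes the pressure integral reads $P/P_0 = (a/\pi)\int_0 \cos\theta\,d\theta/\{(ct + a\cos\theta)^2 - \varpi^2\}^{1/2}$, and in the second stage it applies the substitution $ct + a - \varpi = 2b$, $ct + a\cos\theta - \varpi = 2b\cos^2\psi$, which (on working through the change of variable) turns the integral into the smooth form $\frac{a}{\pi}\int_0^{\pi/2}\frac{(1 - (2b/a)\sin^2\psi)\,d\psi}{\{a(1 - (b/a)\sin^2\psi)(\varpi + b\cos^2\psi)\}^{1/2}}$. Function names are hypothetical.

```python
import math

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    s = f(a) + f(b)
    for j in range(1, steps):
        s += (4 if j % 2 else 2) * f(a + j * h)
    return s * h / 3

def pressure_stage2(ct, a, varpi):
    """P/P0 for varpi - a < ct < varpi + a, using the psi substitution."""
    b = 0.5 * (ct + a - varpi)
    def integrand(psi):
        s2, c2 = math.sin(psi) ** 2, math.cos(psi) ** 2
        cos_theta = 1 - 2 * b * s2 / a
        return cos_theta / math.sqrt(a * (1 - b * s2 / a) * (varpi + b * c2))
    return (a / math.pi) * simpson(integrand, 0.0, math.pi / 2)

def pressure_stage3(ct, a, varpi):
    """P/P0 for ct > varpi + a, where the upper limit of theta is pi."""
    f = lambda th: math.cos(th) / math.sqrt((ct + a * math.cos(th)) ** 2 - varpi ** 2)
    return (a / math.pi) * simpson(f, 0.0, math.pi)

def pressure_tail(ct, a, varpi):
    """Late-time approximation -a^2 ct / (2 (c^2 t^2 - varpi^2)^(3/2))."""
    return -0.5 * a * a * ct / (ct * ct - varpi ** 2) ** 1.5
```

With a = 1 and $\varpi$ = 3, `pressure_stage2` just after ct = 2 returns very nearly $\tfrac12(a/\varpi)^{1/2}$, it is still positive at ct = 3, and `pressure_stage3` at large ct follows `pressure_tail`, showing the slowly decaying negative tail.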


22·03. Light string with concentrated loads. We have seen that the operational
method is universally valid for the treatment of a properly specified finite set of linear
equations, and does not need the use of Bromwich's integral. We have suggested that
continuous systems are best regarded physically as derived from discrete systems by a
limiting process, and the solutions found for them as the limits of the solutions for the
discrete systems. It is desirable, therefore, to have a concrete example showing how the
kind of operator that arises for continuous systems can also arise as the limit of a sequence
of operators applicable to discrete systems. One such example is provided by the uniform
stretched string under tension $P = \rho c^2$, with mass $\rho$ per unit length. If we replace this by
a light string under tension P, with particles of mass $\rho l$ at intervals l, we have a discrete
system with the same average mass per unit length, and we can approach the uniform
string as a limit by taking l indefinitely small. The equation of motion of a particle is

$$\ddot{y}_r = -\frac{c^2}{l^2}(2y_r - y_{r-1} - y_{r+1}), \tag{1}$$

which reduces, on putting x = rl and letting $l \to 0$ with x fixed, to

$$\frac{\partial^2y}{\partial t^2} = c^2\frac{\partial^2y}{\partial x^2}. \tag{2}$$

* Lamb, Phil. Trans. A, 203, 1904, 1-42.

Suppose that the system starts from rest, that the particle with r = m is kept fixed, and
that $y_0$ is made to vary with the time in a prescribed manner. Then the subsidiary equations
are

$$\left(p^2 + \frac{2c^2}{l^2}\right)y_r = \frac{c^2}{l^2}(y_{r-1} + y_{r+1}) \quad (0 < r < m). \tag{3}$$

These can be solved formally by putting

$$y_r = Ae^{r\lambda}; \tag{4}$$

then

$$\frac{p^2l^2}{c^2} + 2 = e^{\lambda} + e^{-\lambda} = 2\cosh\lambda \tag{5}$$

and there are two equal and opposite real values of $\lambda$ for real p. Then if

$$\sinh\tfrac12\lambda = \frac{pl}{2c} \tag{6}$$

(5) is satisfied; and if we now take the positive root, and take

$$y_r = \frac{\sinh(m - r)\lambda}{\sinh m\lambda}\,y_0, \tag{7}$$

all the conditions are satisfied. But $\sinh s\lambda/\sinh\lambda$ is a polynomial in $\sinh\tfrac12\lambda$ of degree
$2(s - 1)$. The operator is therefore a rational function of p, and its expansion in descending
powers of p starts with $(c/lp)^{2r}$. It follows that the further a particle is from the disturbed
end the more gradually it will begin to move. But also we can expand in exponentials

$$\frac{\sinh(m - r)\lambda}{\sinh m\lambda} = e^{-r\lambda}(1 - e^{-2(m-r)\lambda})(1 + e^{-2m\lambda} + e^{-4m\lambda} + \dots). \tag{8}$$

For if we replace p by z and $\lambda$ by $\zeta$, with $\Re(z) > 0$, then $\Re(\zeta) > 0$, and $|e^{-\zeta}| < 1$. The first
term is then

$$w_r = e^{-r\lambda}y_0 = \left\{\left(1 + \frac{p^2l^2}{4c^2}\right)^{1/2} + \frac{pl}{2c}\right\}^{-2r}y_0, \tag{9}$$

which again is expansible in negative powers of p and satisfies our fundamental rules.
Further, if lr = x and $l \to 0$, it tends formally to $e^{-px/c}$, the operator characteristic of waves
in a uniform string. It does not lead to an interpretation of $e^{px/c}$ by a limiting process
because if we change the sign of r or p we get an expression that is not expansible in
negative powers of p.

The physical string, however, has a molecular structure, and we are concerned to know
how closely the solution for the continuous string approximates to that for the actual
string. For this purpose we take $y_0 = H(t)$; we want to see whether $w_r \to H(t - x/c)$ when
rl = x is fixed and l is small. Then

$$w_r = \frac{1}{2\pi i}\int_L e^{zt}\left\{\left(1 + \frac{z^2l^2}{4c^2}\right)^{1/2} + \frac{zl}{2c}\right\}^{-2r}\frac{dz}{z}, \tag{10}$$

and we use the method of steepest descents. Put

$$\phi(z) = zt - 2r\log\left\{\left(1 + \frac{z^2l^2}{4c^2}\right)^{1/2} + \frac{zl}{2c}\right\}; \tag{11}$$

then

$$\phi'(z) = t - \frac{rl}{c\sqrt{\{1 + z^2l^2/4c^2\}}}, \tag{12}$$

$$\phi''(z) = \frac{rzl^3}{4c^3(1 + z^2l^2/4c^2)^{3/2}}. \tag{13}$$

If rl = x, $x/ct = \xi$, the saddle-points are at $z = \pm(2c/l)(\xi^2 - 1)^{1/2}$ and therefore are on the
real or the imaginary axis according as $\xi$ is greater or less than 1.

If $\xi > 1$ we find

$$w_r \approx \tfrac12\,\xi\left(\frac{l}{\pi ct}\right)^{1/2}(\xi^2 - 1)^{-3/4}\exp\left[-\frac{2ct}{l}\{\xi\cosh^{-1}\xi - (\xi^2 - 1)^{1/2}\}\right], \tag{14}$$

which tends exponentially to 0 for given $\xi$ if $l \to 0$. If $\xi - 1$ is small, put $\xi = \cosh u$, $v = ctu^3/l$
and consider values of $\xi$ such that v is large. We find that if

$$v = 6, \qquad w_r = 0{\cdot}0021, \qquad \xi = 1 + \tfrac12(vl/ct)^{2/3}.$$

Hence if ct/l is large $w_r$ is negligible if x exceeds ct by an amount that increases only like
$t^{1/3}$, and tends to zero with l. If $l = 10^{-8}$ cm., ct = 10 cm., v = 6, we have $\xi - 1 = 1{\cdot}7 \times 10^{-6}$.
With ordinary magnitudes, therefore, the motion is negligible at distances so little in
advance of the ideal pulse that the continuous system gives as good an approximation
as we should ever need.

If $x < ct$ the lines of steepest descent through the saddle-points proceed from and to
$-\infty$; obviously they cannot approach $+\infty$, since the integrand tends to infinity there.
Hence they are not together equivalent to the path L, since the pole at z = 0 lies in
between. We must therefore include a loop from $-\infty$ about the origin, and this makes a
contribution 1 to $w_r$. We find then that for $x < ct$

$$w_r \approx 1 - \left(\frac{l}{\pi ct}\right)^{1/2}\frac{\xi}{(1 - \xi^2)^{3/4}}\cos\left[\frac{2ct}{l}\{(1 - \xi^2)^{1/2} - \xi\cos^{-1}\xi\} + \tfrac14\pi\right]. \tag{15}$$

When $l \to 0$ this tends to 1, as we expected; but the correction term tends to 0 only like
$l^{1/2}$ instead of exponentially, its wave-length tending to zero with l. However, with
$l/ct = 10^{-9}$, $1 - \xi > 5 \times 10^{-5}$, the term never exceeds $1{\cdot}7 \times 10^{-2}$ and diminishes with
decreasing $\xi$. The sharp front followed by a constant displacement is therefore a good
approximation.
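The numerical magnitudes quoted for the discrete string can be reproduced from the saddle-point estimates. The sketch below rests on two assumptions made here rather than on anything explicit in the text: that the estimate ahead of the front is $w_r \approx \tfrac12\xi(l/\pi ct)^{1/2}(\xi^2-1)^{-3/4}\exp[-(2ct/l)\{\xi\cosh^{-1}\xi - (\xi^2-1)^{1/2}\}]$, and that the amplitude of the oscillating correction behind the front is $\xi(l/\pi ct)^{1/2}(1-\xi^2)^{-3/4}$. The function names are hypothetical.

```python
import math

def ahead_of_front(v, l_over_ct):
    """Return (xi, w_r) for xi = cosh u with v = ct u^3 / l, using the estimate for xi > 1."""
    u = (v * l_over_ct) ** (1.0 / 3.0)
    xi = math.cosh(u)
    # xi*acosh(xi) - sqrt(xi^2 - 1) written in terms of u.
    exponent = -(2.0 / l_over_ct) * (xi * u - math.sinh(u))
    prefactor = 0.5 * xi * math.sqrt(l_over_ct / math.pi) * math.sinh(u) ** -1.5
    return xi, prefactor * math.exp(exponent)

def ripple_amplitude(xi, l_over_ct):
    """Amplitude of the oscillating correction for xi < 1."""
    return xi * math.sqrt(l_over_ct / math.pi) * (1 - xi * xi) ** -0.75
```

With l = 1e-8 cm and ct = 10 cm, so that `l_over_ct = 1e-9`, `ahead_of_front(6.0, 1e-9)` gives a displacement close to 0.0021 at a value of $\xi - 1$ of about 1.7e-6, and `ripple_amplitude(1 - 5e-5, 1e-9)` is between 1.7e-2 and 1.8e-2.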

The change of phase of the correction term from one particle to the next is small if $\xi$
is only a little less than 1, but approaches $\pi$ if $\xi$ is small. We therefore have dispersion.
The disturbance can be regarded as including all possible wave-lengths > 2l; the longest
have group-velocity c, but the shortest group-velocity 0. This can be seen by returning
to (10) and writing the first two factors as $\exp i(\gamma t - \kappa x)$, with

$$z = i\gamma, \qquad rl = x, \qquad \frac{\gamma l}{2c} = \sin\theta, \qquad \kappa = \frac{2\theta}{l}. \tag{16}$$

The wave velocity is then

$$\frac{\gamma}{\kappa} = c\,\frac{\sin\theta}{\theta}, \tag{17}$$

and values of $\theta$ between 0 and $\tfrac12\pi$ are admissible. (Larger values would give the same
displacement where x ranges through exact multiples of l and are therefore irrelevant.)
The wave velocity therefore ranges from c to $2c/\pi$, and the group velocity is

$$\frac{d\gamma}{d\kappa} = c\cos\theta, \tag{18}$$

which ranges from c to 0.
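The dispersion relation of the loaded string is simple enough to tabulate directly. A minimal sketch, assuming the relations $\gamma l/2c = \sin\theta$ and $\kappa = 2\theta/l$, so that $\gamma = (2c/l)\sin(\tfrac12\kappa l)$; the function names are illustrative.

```python
import math

def frequency(kappa, c, l):
    """gamma as a function of kappa: gamma = (2c/l) sin(kappa l / 2)."""
    return (2 * c / l) * math.sin(0.5 * kappa * l)

def wave_velocity(kappa, c, l):
    """Phase velocity gamma / kappa = c sin(theta) / theta, theta = kappa l / 2."""
    return frequency(kappa, c, l) / kappa

def group_velocity(kappa, c, l):
    """Group velocity d(gamma)/d(kappa) = c cos(theta)."""
    return c * math.cos(0.5 * kappa * l)
```

At the shortest admissible wave-length, $\kappa = \pi/l$, `wave_velocity` returns $2c/\pi$ while `group_velocity` vanishes; for $\kappa l$ small both tend to c.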

Suppose now that the string extends on both sides of the particle specified by r = 0,
and that instead of the motion of this particle being prescribed it is given an initial displacement
u and then released. Its subsidiary equation is

$$\left(p^2 + \frac{2c^2}{l^2}\right)y_0 - \frac{c^2}{l^2}(y_1 + y_{-1}) = p^2u, \tag{19}$$

and if the time is short enough for waves reflected at the ends not to have arrived

$$y_1 = y_{-1} = e^{-\lambda}y_0, \qquad y_0 = \frac{pul/2c}{\cosh\tfrac12\lambda},$$

$$y_r = \frac{pul/2c}{(1 + p^2l^2/4c^2)^{1/2}}\left\{\frac{pl}{2c} + \left(1 + \frac{p^2l^2}{4c^2}\right)^{1/2}\right\}^{-2r} \tag{20}$$

$$= u\,J_{2r}\!\left(\frac{2ct}{l}\right) \tag{21}$$

from 21·01 (23).

A Bessel function takes its maximum value when the argument slightly exceeds the
order, so that we see at once that the greatest displacement travels out with velocity c,
and from 21·05 (5) that when ct is large compared with rl the motion of a particle is
oscillatory, consecutive particles differing in phase by amounts approaching $\pi$. The rapid
variation of the phase of the movement suggests an analogy with heat conduction, but it
is quite systematic, and the essential property of heat conduction is that the variation is
not systematic. Consecutive particles in random motion are as likely to be in the same
phase as in opposite phases. A little further examination shows that the analogy breaks
down in another respect.* If we consider random initial displacements and velocities
given to all particles in a finite length of the string, the energy is found to spread out so
that the length of the string that contains a given fraction of the initial energy increases
in proportion to t. In heat conduction the length in question would increase like $t^{1/2}$.
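The Bessel-function solution for a single displaced particle can be compared with a direct integration of the equations of motion. The sketch below is an illustration under stated assumptions: the equation of motion is taken as $\ddot{y}_r = (c^2/l^2)(y_{r-1} - 2y_r + y_{r+1})$, the solution as $y_r = u\,J_{2r}(2ct/l)$, a finite chain with fixed ends stands in for the long string (valid only until the disturbance reaches the ends), and the time stepping is a velocity Verlet scheme chosen here for simplicity.

```python
import math

def bessel_j(n, x, m=256):
    """Integer-order J_n(x): mean of cos(x sin t - n t) over one period."""
    return sum(math.cos(x * math.sin(2 * math.pi * k / m) - n * 2 * math.pi * k / m)
               for k in range(m)) / m

def simulate_chain(t_end, half_width=20, c=1.0, l=1.0, u=1.0, dt=1e-3):
    """Displacements at time t_end for particles r = -half_width .. half_width.

    Particle r = 0 starts displaced by u, all start at rest, and the two end
    particles are held fixed.
    """
    n = 2 * half_width + 1
    y = [0.0] * n
    vel = [0.0] * n
    y[half_width] = u
    k = (c / l) ** 2

    def accel(pos):
        a = [0.0] * n
        for i in range(1, n - 1):
            a[i] = k * (pos[i - 1] - 2 * pos[i] + pos[i + 1])
        return a

    a = accel(y)
    for _ in range(round(t_end / dt)):
        for i in range(n):
            vel[i] += 0.5 * dt * a[i]
            y[i] += dt * vel[i]
        a = accel(y)
        for i in range(n):
            vel[i] += 0.5 * dt * a[i]
    return {r: y[half_width + r] for r in range(-half_width, half_width + 1)}
```

Calling `simulate_chain(5.0)` and comparing entry r with `bessel_j(2 * r, 10.0)` shows the two agreeing closely for the first few particles, with neighbouring particles already tending to move in opposite phase.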

There is a considerable change if there is any irregularity in the structure of the system
itself. Let us assume a harmonic wave train coming from negative x, but that the particle
specified by suffix 0 has mass $\rho l(1 + a)$ instead of $\rho l$, where a may be small. There will be
a reflected wave; we therefore take

$$y = \exp i(\gamma t - \kappa x) + A\exp i(\gamma t + \kappa x) \quad (x/l \le 0), \tag{22}$$

$$y = B\exp i(\gamma t - \kappa x) \quad (x/l \ge 0). \tag{23}$$

Then 1 + A = B.

Also the equation of motion for this particle is

$$\left\{-(1 + a)\gamma^2 + \frac{2c^2}{l^2}\right\}y_0 = \frac{c^2}{l^2}(y_{-1} + y_{+1}), \tag{24}$$

and substituting from (22) and (23) we find, with $\kappa l = 2\theta$,

$$B = \frac{1}{1 + ia\tan\theta}. \tag{25}$$

If $\theta$ is small, corresponding to long waves, B is practically 1 and there is nearly perfect
transmission. But if $\theta$ is nearly $\tfrac12\pi$, corresponding to the shortest waves possible, there will
be nearly perfect reflexion even if a is small. Thus even a slight irregularity of structure 
will practically destroy the tail of a wave train. If there is a disturbance between two such 
irregularities much of the energy will be reflected several times before it gets past either, 
and a number of minor irregularities will give an irregular motion closely resembling 
thermal agitation, with a slow leakage resembling conduction. In an actual solid we have 
a three-dimensional form of the same problem, the irregularities arising from random 
motions of electrons even in a crystal and from local departures from a regular pattern 
in a glass. 
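
The transmitted amplitude for a single heavy particle can be checked numerically. The sketch below is only illustrative: it assumes the transmitted amplitude $B = 1/(1 + i\alpha\tan\theta)$ with $kl = 2\theta$, the continuity condition $1 + A = B$, and the dispersion relation $\gamma = (2c/l)\sin\theta$ of the regular chain (which follows from the equation of motion with $\alpha = 0$). The function names and the values of c, l and $\alpha$ are hypothetical choices made for the illustration.

```python
import cmath
import math


def transmission(alpha, theta):
    """Transmitted amplitude B for fractional mass excess alpha, with theta = k*l/2."""
    return 1.0 / (1.0 + 1j * alpha * math.tan(theta))


def motion_residual(alpha, theta, c=1.0, l=1.0):
    """Residual of the equation of motion of the irregular particle (suffix 0).

    Uses y_{-1} from the incident plus reflected wave and y_{+1} from the
    transmitted wave; the common factor exp(i*gamma*t) is dropped.
    """
    k = 2.0 * theta / l
    gamma = (2.0 * c / l) * math.sin(theta)  # dispersion relation of the regular chain
    B = transmission(alpha, theta)
    A = B - 1.0                               # from 1 + A = B
    y_minus = cmath.exp(1j * k * l) + A * cmath.exp(-1j * k * l)  # particle at x = -l
    y_plus = B * cmath.exp(-1j * k * l)                           # particle at x = +l
    lhs = (-(1.0 + alpha) * gamma ** 2 + 2.0 * c ** 2 / l ** 2) * B
    rhs = (c ** 2 / l ** 2) * (y_minus + y_plus)
    return abs(lhs - rhs)


if __name__ == "__main__":
    for theta in (0.05, 0.8, 1.5):
        B = transmission(0.1, theta)
        print(theta, abs(B) ** 2, motion_residual(0.1, theta))
```

With $\alpha = 0.1$ the transmitted fraction $|B|^2$ stays close to 1 for small $\theta$ and falls sharply as $\theta$ approaches $\tfrac12\pi$, which is the behaviour described in the text.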


* Jeffreys, Proc. Camb. Phil. Soc. 23, 1927, 775.

22·04. Diffusion as a limit. It is strictly meaningless to speak of the temperature
at a point, since the temperature expresses the mean energy of random motion of a
number of particles; if we speak of the absolute temperature as specified within a factor
of $10^{-3}$ we must be considering something of the order of $10^6$ particles. In the strict
mathematical sense, therefore, the space derivatives of the temperature do not exist. But if l
is sufficiently large for the difference of temperature between two places l apart to be
considerably more than its uncertainty we can convert the equation of conduction into
a finite difference equation, the one-dimensional form of which is

$$ \frac{dy_r}{dt} = \frac{h^2}{l^2}\,(y_{r-1} - 2y_r + y_{r+1}). \tag{1} $$


We take $y_0 = H(t)$ and solve operationally. The analysis is very similar to that at the
beginning of the last section, $p$ replacing $p^2$, and we get as solution for the inward diffusion
corresponding to (9) of 22·03


«V = 



( 2 ) 


which is expansible in powers of $p^{-1}$. If l is made to tend to 0 while rl tends to x, $w_r$ tends
formally to $e^{-p^{1/2}x/h}$, that is, to our $e^{-qx}$. Thus we have obtained the latter operator as the
limit of one expressible as a power series in $p^{-1}$. The difference $(w_1 - w_0)/l$ yields a derivation
of $p^{1/2}$, namely


Now 


p 1/a = Alim 
i-*o 


IV_ = -- 

2m 


i[-K 


i + ?P) 
+ 4A 2 / 


Va 


♦«n* 


i+ 




4A 2 ) '' 2A 




( 3 ) 


( 4 ) 


By Dalzell's theorem (12·101) we can reverse the order of integration and passage to the
limit; then

$$ \lim_{l\to 0} w_r = \frac{1}{2\pi i}\int_L \exp\left(zt - \frac{z^{1/2}x}{h}\right)\frac{dz}{z} = 1 - \operatorname{erf}\frac{x}{2ht^{1/2}}, \tag{5} $$

from 12·126.

When l is small but not zero the saddle-point is slightly displaced, but we may take the 
path to be the path of steepest descent for the integral in (5). On this path we can expand 
in ascending powers of l, for the main contribution comes from the neighbourhood of the 
saddle-point, and an expansion exists if l is small. Then 


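$$ w_r \approx \frac{1}{2\pi i}\int \left(1 + \frac{z^{3/2}xl^2}{24h^3}\right)\exp\left(zt - \frac{z^{1/2}x}{h}\right)\frac{dz}{z} = \left(1 - \frac{xl^2}{24}\frac{\partial^3}{\partial x^3}\right)\left(1 - \operatorname{erf}\frac{x}{2ht^{1/2}}\right) $$

$$ = 1 - \operatorname{erf}\frac{x}{2ht^{1/2}} - \frac{xl^2}{48h^3\sqrt{(\pi t^3)}}\left(1 - \frac{x^2}{2h^2t}\right)\exp\left(-\frac{x^2}{4h^2t}\right). \tag{6} $$
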


If $x/2ht^{1/2}$ is large both terms are small irrespective of l. The important values of x are of
the order of $2ht^{1/2}$ or less. At these the last term is of the order of $l^2/48h^2t$ of the first, so that
it can be neglected if $t \gg l^2/48h^2$.

If we take $l = 10^{-5}$ cm., $h = 0{\cdot}1$ c.g.s. (a value for a bad conductor), the critical value
of t is $2 \times 10^{-10}$ sec. Thus for a short time after the initial disturbance of temperature the
usual solution is invalid because the temperature itself is meaningless; but this time is
very short and would be shorter for better conductors. The approach through the finite
system therefore confirms the results obtained by treating the continuous case directly
and answers the logical objection to that method.
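
The passage to the limit can also be illustrated numerically. The following sketch, offered only as an illustration, integrates the finite difference equation $dy_r/dt = (h^2/l^2)(y_{r-1} - 2y_r + y_{r+1})$ with $y_0 = 1$ by a simple explicit step (this time-stepping scheme is an assumption of the sketch, not part of the text) and compares the result with $1 - \operatorname{erf}(x/2ht^{1/2})$. It also evaluates the critical time $l^2/48h^2$. All function names and default values are hypothetical.

```python
import math


def chain_temperature(x, t, h=1.0, l=0.05, n_nodes=200, cfl=0.2):
    """Value of y_r at x = r*l and time t for the discrete system with y_0 = 1.

    Explicit Euler steps with h^2*dt/l^2 <= cfl are used; the far end of the
    chain (r = n_nodes) is held at zero and must be distant enough not to matter.
    """
    steps = max(1, int(math.ceil(t / (cfl * l * l / (h * h)))))
    dt = t / steps
    coeff = h * h * dt / (l * l)
    y = [0.0] * (n_nodes + 1)
    y[0] = 1.0  # boundary value y_0 = H(t)
    for _ in range(steps):
        new = y[:]
        for r in range(1, n_nodes):
            new[r] = y[r] + coeff * (y[r - 1] - 2.0 * y[r] + y[r + 1])
        y = new
    return y[int(round(x / l))]


def continuum_temperature(x, t, h=1.0):
    """Limiting solution 1 - erf(x / (2 h t^(1/2)))."""
    return math.erfc(x / (2.0 * h * math.sqrt(t)))


def critical_time(l, h):
    """Time below which the correction term of order l^2/(48 h^2 t) is not small."""
    return l * l / (48.0 * h * h)


if __name__ == "__main__":
    print(chain_temperature(0.5, 0.5), continuum_temperature(0.5, 0.5))
    print(critical_time(1e-5, 0.1))
```

For $l = 10^{-5}$ cm. and $h = 0{\cdot}1$ c.g.s. the last function gives about $2 \times 10^{-10}$ sec., in agreement with the estimate above.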
22·05. The submarine cable. This is a uniform conductor with self-induction,
capacity, resistance, and leakage. At distance x from the end suppose that the charge per
unit length is y, the potential $\phi$, and the current J. Then the following equations hold:

$$ l\frac{\partial J}{\partial t} + rJ = -\frac{\partial\phi}{\partial x}, \tag{1} $$

$$ \frac{\partial y}{\partial t} = -\frac{\partial J}{\partial x} - s\phi, \tag{2} $$

$$ y = k\phi. \tag{3} $$

l, k and r are the self-induction, capacity and resistance per unit length. The leakage is such
that a potential $\phi$ produces a current $s\phi$ per unit length leaking away through the insulation.
Up to t = 0, y, J, and $\phi$ are zero; afterwards $\phi$ is raised to $\phi_0$, which may be a function
of t, at x = 0. Then the subsidiary equations are

$$ (lp + r)J = -\frac{\partial\phi}{\partial x}, \tag{4} $$

$$ (kp + s)\phi = -\frac{\partial J}{\partial x}, \tag{5} $$

whence

$$ \frac{\partial^2\phi}{\partial x^2} = (lp + r)(kp + s)\phi. \tag{6} $$

Put $(lp + r)(kp + s) = q^2 = lk\{(p+\rho)^2 - \sigma^2\}$.

Then the operational solution, neglecting reflexion from the far end, is

$$ \phi = e^{-qx}\phi_0. \tag{7} $$

If self-induction and leakage are negligible we have l = 0, s = 0, $q^2 = krp$. Then the
solution has the same form as for conduction of heat. This condition occurs in ordinary
telegraph wires. If in this case $\phi_0 = H(t)$,

$$ \phi = 1 - \operatorname{erf}\left\{\frac{x}{2}\left(\frac{kr}{t}\right)^{1/2}\right\}. \tag{8} $$

For given x, $\phi \to 1$ for large t, and the approach is quicker if kr is small. For fairly short
lines the time needed is short enough for successive signals to be transmitted without
overlapping. But for long ones, and especially for submarine cables, the time needed to
build up the requisite potential at the receiving end is long enough to interfere seriously
with the practicable speed of signalling. Reduction of kr means thicker conductors and
therefore prohibitive cost. The modification introduced by Heaviside was the deliberate
introduction of self-induction and leakage far above what the simple cable possessed. The
principle of the self-induction can be seen from the rough analogy of a projectile thrown
through air. If it has negligible mass the resistance damps down the motion quickly and
it does not travel far. But a heavier projectile, though it needs more effort to give it the
same velocity, keeps its velocity better and travels further. Self-induction acts in much
the same way as inertia. The effect of leakage is less obvious, but will appear in a moment.
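
The dependence of the signalling delay on the length of the line can be made concrete. The sketch below assumes the step response $\phi = 1 - \operatorname{erf}\{\tfrac12 x(kr/t)^{1/2}\}$ for a line without self-induction or leakage, and finds by bisection the time at which the received potential reaches a chosen fraction of its final value. The function names, the level 0·5 and the unit values of k and r are hypothetical choices for illustration.

```python
import math


def potential(x, t, k, r):
    """Step response at distance x and time t with l = 0 and s = 0."""
    return math.erfc(0.5 * x * math.sqrt(k * r / t))


def rise_time(x, k, r, level=0.5):
    """Time at which the potential at distance x first reaches `level` (0 < level < 1)."""
    lo, hi = 0.0, 1.0
    while potential(x, hi, k, r) < level:  # expand until the level is bracketed
        hi *= 2.0
    for _ in range(200):                   # bisection; the potential increases with t
        mid = 0.5 * (lo + hi)
        if potential(x, mid, k, r) < level:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    print(rise_time(1.0, 1.0, 1.0), rise_time(2.0, 1.0, 1.0))
```

Since $\phi$ depends on x, k, r and t only through $krx^2/t$, the time needed is proportional to $krx^2$: doubling the length of the line quadruples it, which is why the effect is serious for long cables.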
We put

$$ lk = 1/c^2. \tag{9} $$

Since $\rho + \sigma = r/l$ and $\rho - \sigma = s/k$ are both positive, $\rho$ is positive. Then $0 \le |\sigma| < \rho$.

If $\sigma = 0$ we have simply

$$ \phi = \exp\{-(p+\rho)x/c\}\,\phi_0 = e^{-\rho x/c}\,\phi_0(t - x/c). \tag{10} $$

Hence the variation of $\phi$ with time at distance x is an exact copy of that at the
transmitter except for the delay x/c and the constant attenuation factor $e^{-\rho x/c}$. The velocity c
is very high and the attenuation can be compensated by amplifiers at the receiving end.
This arrangement is the distortionless cable, and is achieved if ls = kr.
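
As a check on the algebra, the constants can be computed from the line parameters. The sketch assumes the definitions $\rho + \sigma = r/l$, $\rho - \sigma = s/k$ and $lk = 1/c^2$, and verifies the factorization $(lp + r)(kp + s) = lk\{(p+\rho)^2 - \sigma^2\}$ at a sample value of p. The function names and numerical values are hypothetical.

```python
def cable_constants(l, k, r, s):
    """Return (rho, sigma, c) for self-induction l, capacity k, resistance r, leakage s."""
    rho = 0.5 * (r / l + s / k)
    sigma = 0.5 * (r / l - s / k)
    c = (l * k) ** -0.5
    return rho, sigma, c


def q_squared_direct(p, l, k, r, s):
    """q^2 written as (lp + r)(kp + s)."""
    return (l * p + r) * (k * p + s)


def q_squared_factored(p, l, k, r, s):
    """q^2 written as lk{(p + rho)^2 - sigma^2}."""
    rho, sigma, _ = cable_constants(l, k, r, s)
    return l * k * ((p + rho) ** 2 - sigma ** 2)


if __name__ == "__main__":
    print(cable_constants(2.0, 3.0, 4.0, 6.0))  # ls = kr, so sigma vanishes
    print(q_squared_direct(1.3, 2.0, 3.0, 5.0, 1.0),
          q_squared_factored(1.3, 2.0, 3.0, 5.0, 1.0))
```

When ls = kr the computed $\sigma$ is zero, which is the distortionless condition.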

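If $\sigma$ is not zero, we take $\phi_0 = H(t)$ and the solution is

$$ \phi = \exp\left[-\frac{x}{c}\{(p+\rho)^2 - \sigma^2\}^{1/2}\right] = \frac{1}{2\pi i}\int_L \exp\left[zt - \frac{x}{c}\{(z+\rho)^2-\sigma^2\}^{1/2}\right]\frac{dz}{z}. \tag{11} $$

The current is

$$ J = -\frac{1}{l(p+\rho+\sigma)}\frac{\partial\phi}{\partial x} = \frac{1}{lc}\,\frac{p+\rho-\sigma}{\{(p+\rho)^2-\sigma^2\}^{1/2}}\exp\left[-\frac{x}{c}\{(p+\rho)^2-\sigma^2\}^{1/2}\right]. \tag{12} $$

First omit the term in $\rho - \sigma$ from the first factor. What remains is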


exp --{(z+p) 2 -<r z yh + zt dz 

r = — —L | _L_ - _ A _ 

lc2niJ L {(z + p) 2 - cr 2 } 1 '* 

l f exp “ ~ c ^ ^ 

~2mlc eP J L (^-cr 2 ) 1 /* 


Put £ = £cr(u + 1/w); then 

T 1 , f .[ t t xu x\du 

1 - jet* « 

«- xlc>0) ■ 

The part omitted follows by integration; we have 

For <j>, notice first that 

exp [-^{(p+P) 2 -o- 2 } 1/a ] = e~<P+/» x/c ex p c + "} 


(13) 


(14) 

(15) 

(16) 


the second exponential being in descending powers of $p + \rho$; hence $\phi$ is zero up to time
x/c and then jumps to $e^{-\rho x/c}$, afterwards varying continuously. Secondly, we can take the
termini to have real part $-\infty$, and then differentiate with regard to t; then


I-^LH^ {(2+p),-<r2 >i <fe 


K-iSW]] 


<re~ pi x/c 


(t 2 -x 2 jc 2 




( 17 ) 
But we know $\phi$ just after t = x/c; hence



{t>x,c) • (18) 

22·06. Line source of heat: line vortex. The equation of heat conduction, for the
case of symmetry about a straight line, is

$$ \frac{\partial V}{\partial t} - \frac{h^2}{\varpi}\frac{\partial}{\partial\varpi}\left(\varpi\frac{\partial V}{\partial\varpi}\right) = 0. \tag{1} $$

The same equation is satisfied by the vorticity $\zeta$ in a viscous liquid when the motion is
in circles about an axis,* $h^2$ being replaced by $\nu$, the kinematic viscosity. Take first a
concentration of heat $\kappa$ per unit length along the axis at t = 0; the operational solution
will be

$$ V = A\,\mathrm{Kh}_0(q\varpi), \tag{2} $$

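A being a function of t. If $\rho c$ is the heat capacity per unit volume the excess heat within
distance $\varpi$ of the axis, per unit distance parallel to the axis, is

$$ \int_0^{\varpi} \rho c V\,2\pi\varpi\,d\varpi = 2\pi A\int_0^{\varpi}\rho c\,\varpi\,\mathrm{Kh}_0(q\varpi)\,d\varpi = -2\pi A\rho c\,q^{-2}\Big[q\varpi\,\mathrm{Kh}_1(q\varpi)\Big]_0^{\varpi}. \tag{3} $$

When $t \to 0$ this must tend to $\kappa$ for all $\varpi > 0$; but

$$ \lim_{t\to 0}\,\{q\varpi\,\mathrm{Kh}_1(q\varpi)\} = \lim_{q\to\infty}\,\{q\varpi\,\mathrm{Kh}_1(q\varpi)\} = 0 \tag{4} $$

and

$$ \lim_{\varpi\to 0}\, q\varpi\,\mathrm{Kh}_1(q\varpi) = \frac{2}{\pi}. \tag{5} $$

Hence

$$ \kappa = 4A\rho c\,q^{-2} \tag{6} $$

and

$$ V = \frac{\kappa}{4\rho c}\,q^2\,\mathrm{Kh}_0(q\varpi) = \frac{\kappa}{4\pi\rho c\,h^2 t}\,e^{-\varpi^2/4h^2t}. \tag{7} $$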


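Similarly if $\kappa$ is the original circulation about the axis and we put $p = \nu q^2$ we get (2)
for $\zeta$ instead of V, and the circulation is $2\pi\int_0^{\varpi}\zeta\varpi\,d\varpi$. Proceeding similarly we find†

$$ \zeta = \frac{\kappa}{4\pi\nu t}\,e^{-\varpi^2/4\nu t}. $$

The circulation is $\kappa(1 - e^{-\varpi^2/4\nu t})$ and the velocity is therefore

$$ \frac{\kappa}{2\pi\varpi}\left(1 - e^{-\varpi^2/4\nu t}\right). $$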
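
The relation between the vorticity and the circulation of the diffusing line vortex is easily verified by quadrature. The sketch below assumes $\zeta = (\kappa/4\pi\nu t)\exp(-\varpi^2/4\nu t)$ and integrates $2\pi\zeta\varpi$ from 0 to $\varpi$ by Simpson's rule, comparing with $\kappa(1 - e^{-\varpi^2/4\nu t})$. Function names and the sample values are hypothetical.

```python
import math


def vorticity(w, t, kappa=1.0, nu=1.0):
    """Vorticity at distance w from the axis at time t."""
    return kappa / (4.0 * math.pi * nu * t) * math.exp(-w * w / (4.0 * nu * t))


def circulation_numeric(w, t, kappa=1.0, nu=1.0, n=2000):
    """Circulation 2*pi * integral of zeta*s ds from 0 to w (composite Simpson, n even)."""
    step = w / n
    total = 0.0
    for i in range(n + 1):
        s = i * step
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * 2.0 * math.pi * s * vorticity(s, t, kappa, nu)
    return total * step / 3.0


def circulation_closed(w, t, kappa=1.0, nu=1.0):
    """Closed form kappa * (1 - exp(-w^2 / (4 nu t)))."""
    return kappa * (1.0 - math.exp(-w * w / (4.0 * nu * t)))


def velocity(w, t, kappa=1.0, nu=1.0):
    """Circumferential velocity, circulation / (2 pi w)."""
    return circulation_closed(w, t, kappa, nu) / (2.0 * math.pi * w)


if __name__ == "__main__":
    print(circulation_numeric(1.5, 0.3), circulation_closed(1.5, 0.3))
```

At large distances the velocity tends to $\kappa/2\pi\varpi$, that of the undiffused vortex, while near the axis it tends to zero.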


* Lamb, Hydrodynamics, 1932, p. 691.
† Goldstein, Proc. Lond. Math. Soc. (2) 34, 1932, 62.
EXAMPLES


1. An elevation of the form $\sin\kappa x$ on the surface of a highly viscous fluid can be shown to
decrease with time like $\exp(-\tfrac12 gt/\nu\kappa)$. Show that if the initial elevation is $\pm\tfrac12$ according as $x \gtrless 0$,
the elevation at any later time is

3 Kh 0 {2e-y* Jri ^/(Agxt/v)} 
and that the discontinuity at x = 0 persists. 


2. A unit e.m.f. is applied at t = 0 to the sending end of a non-inductive line of resistance R,
capacity C, and leakage G per unit length. Show that the current I at the sending end at time t
is given by


where $\lambda = G/C$.


1 = 




r 

J 0 


94 


(I.C. 1943.) 


3. The curved surface and the base (z = 0) of the cylinder $\varpi = a$ are maintained at zero
temperature. The other plane end z = b is maintained at temperature T. Prove that the steady
distribution of temperature is given by

$$ V = \frac{2T}{a}\sum_{n=1}^{\infty}\frac{\sinh\mu_n z}{\mu_n\sinh\mu_n b}\,\frac{J_0(\mu_n\varpi)}{J_1(\mu_n a)}, $$

where $\mu_1, \mu_2, \ldots$ are the zeros of $J_0(\mu a)$.

(I.C. 1939.)


4. A uniform chain of length l and weight w hangs freely from one end, and makes small
oscillations. Calculate the length of the simple pendulum equivalent to the slowest mode of
vibration. (I.C. 1938.)






Chapter 23 

THE CONFLUENT HYPERGEOMETRIC FUNCTION 


‘All changes trying, he will take the form 
Of ev’ry reptile on the earth, will seem 
A river now, and now devouring fire; 

But hold him ye, and grasp him still the more.’

HOMER, Odyssey (Cowper’s translation)

23·01. The hypergeometric function is defined in general by the series

$$ F(a,b;c;z) = 1 + \frac{ab}{1\cdot c}z + \frac{a(a+1)b(b+1)}{2!\,c(c+1)}z^2 + \frac{a(a+1)(a+2)b(b+1)(b+2)}{3!\,c(c+1)(c+2)}z^3 + \ldots \tag{1} $$

and satisfies the differential equation

$$ z(1-z)\frac{d^2u}{dz^2} + \{c - (a+b+1)z\}\frac{du}{dz} - abu = 0. \tag{2} $$

A second solution is

$$ z^{1-c}F(a-c+1,\ b-c+1;\ 2-c;\ z). \tag{3} $$

It can be shown by direct transformation that any second order differential equation
with three regular singularities (one of which may be at infinity) and no other singularities
can be reduced to this form, and that all the solutions about them can be expressed in
terms of the hypergeometric function. The function has a large literature and several well
known functions can be expressed in terms of it. We notice at once that if b = c it reduces
to the binomial series, and if a = b = 1, c = 2, it gives the series for $-z^{-1}\log(1-z)$. The
series expressing the Legendre functions in terms of argument 1 − x is also of this type.
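
The two special cases just mentioned are easily confirmed by summing the series directly. The following is a minimal sketch, assuming |z| < 1 so that the series converges; the function name `hyp2f1` and the sample arguments are choices made for this illustration only.

```python
import math


def hyp2f1(a, b, c, z, terms=400):
    """Partial sum of the hypergeometric series F(a, b; c; z) for |z| < 1."""
    total, term = 1.0, 1.0
    for n in range(terms):
        # ratio of consecutive terms: (a+n)(b+n) z / ((c+n)(n+1))
        term *= (a + n) * (b + n) * z / ((c + n) * (n + 1))
        total += term
    return total


if __name__ == "__main__":
    z = 0.3
    print(hyp2f1(1.0, 1.0, 2.0, z), -math.log(1.0 - z) / z)      # logarithmic case
    print(hyp2f1(0.7, 1.9, 1.9, 0.4), (1.0 - 0.4) ** -0.7)       # binomial case, b = c
```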

If c is not an integer both series are significant for |z| < 1. There are several complications
of the types discussed in Chapter 16 if c is an integer, positive, zero, or negative.
If we put

$$ x = \frac{1}{z} $$

the singularities of the transformed equation are at ∞, 1, 0; and if we put

$$ x = 1 - z $$

the singularities are at 1, 0, ∞. By successive applications of these transformations we
can express the equation in terms of any of the independent variables

$$ z,\quad \frac{1}{z},\quad 1-z,\quad \frac{1}{1-z},\quad \frac{z-1}{z},\quad \frac{z}{z-1}, $$

and in each case it retains its hypergeometric form, and there are two solutions expressible
in terms of hypergeometric functions of the variable used. The equation therefore has
12 solutions of the forms (1), (3), any of which can be expressed in terms of two fundamental
solutions. All have radius of convergence 1. Twelve more can be obtained, of the
types $(1-z)^{c-a-b}F(c-a,\ c-b;\ c;\ z)$ and $z^{1-c}(1-z)^{c-a-b}F(1-a,\ 1-b;\ 2-c;\ z)$. Each of these
is equal to one of the original twelve.


23·02. Series and differential equation for the confluent hypergeometric
function. If we put bz = x and then let b tend to infinity the function becomes the series

$$ u_1 = {}_1F_1(\alpha,\gamma,x) = M(\alpha,\gamma,x) = 1 + \frac{\alpha}{\gamma}x + \frac{\alpha(\alpha+1)}{2!\,\gamma(\gamma+1)}x^2 + \ldots. \tag{1} $$

In this notation the above hypergeometric series would be ${}_2F_1(a,b;c;z)$, the first suffix
denoting the number of factorials in the numerator of the general term, the second the
number, apart from n!, in the denominator. The notation can be extended to series
containing any number of factorials in the general term; such series are known as generalized
hypergeometric functions. The Bessel functions come under the type ${}_0F_1$ and can
evidently be derived by putting $\alpha x = y$ and then making $\alpha$ tend to infinity.

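Evidently ${}_1F_1(\alpha,\gamma,x)$ is an integral function. It satisfies the differential equation

$$ x\frac{d^2u}{dx^2} + (\gamma - x)\frac{du}{dx} - \alpha u = 0. \tag{2} $$

Another solution is found to be

$$ u_2 = x^{1-\gamma}\,{}_1F_1(1+\alpha-\gamma,\ 2-\gamma,\ x) = x^{1-\gamma} + \sum_{r=1}^{\infty}\frac{(1+\alpha-\gamma)\ldots(r+\alpha-\gamma)}{r!\,(2-\gamma)\ldots(r+1-\gamma)}\,x^{r+1-\gamma} \tag{3} $$

except possibly when $\gamma$ is an integer. Hence there are two independent series solutions
except possibly if $\gamma$ is an integer (positive, zero, or negative). If $\gamma = 1$, (1) and (3) are
identical.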

If $\gamma$ is an integer $\geq 2$ all terms of (3) from $r = \gamma - 1$ onwards have a zero factor in the
denominator, and (3) will not be a valid form of solution unless there is also a zero factor
in the numerator, that is, unless $\alpha$ is an integer such that $1 \leq \alpha \leq \gamma - 1$.

If $\gamma$ is zero or a negative integer, and $r = 1 - \gamma$, all terms of (1) from $x^r$ onwards have
vanishing denominators, and (1) will not be a valid form of solution unless $\alpha$ is an integer
such that $\gamma < \alpha \leq 0$.

Hence for certain special values of $\alpha$ there are two series solutions even if $\gamma$ is an integer
different from 1. A terminating series can be found in each case.

If $\gamma$ is not an integer, and $\alpha$ is a negative integer or zero, (1) terminates; if $\alpha - \gamma$ is a
negative integer, (3) terminates.

If $\gamma > 0$, (1) is always significant. We shall see that (1) can also be expressed in terms of
a complex integral, and that all solutions of (2) can be expressed in terms of the integral
used for (1), with suitable changes of the termini, just as all solutions of Bessel's equation
can be expressed by changes of the termini in the complex integrals used for $J_n(x)$, where
$\Re(n) > 0$.

Since the function depends on three variables it is practically beyond the reach of 
tabulation in general. If a function of one variable takes a page to tabulate, one of two 
variables will take a book, one of three variables an ordinary sized room of bookshelves, 
and one of four variables a large library. Consequently the theory of this function, and 
still more of the hypergeometric function, is mainly a matter of general propositions with 
detailed application to a few special cases. 
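23·03. Complex integral solutions. Complex integrals representing the functions
can be obtained at once by operational methods. We write

$$ x^{\gamma-1}\,{}_1F_1(\alpha,\gamma,x) = \sum_{r=0}^{\infty}\frac{\alpha(\alpha+1)\ldots(\alpha+r-1)\,(\gamma-1)!}{r!\,(\gamma+r-1)!}\,x^{\gamma+r-1} $$

$$ = (\gamma-1)!\sum_{r=0}^{\infty}\frac{\alpha(\alpha+1)\ldots(\alpha+r-1)}{r!}\,p^{-(\gamma+r-1)} = (\gamma-1)!\,p^{1-\gamma}\left(1-\frac{1}{p}\right)^{-\alpha} \tag{4} $$

$$ = \frac{(\gamma-1)!}{2\pi i}\int_L\frac{e^{zx}\,dz}{z^{\gamma}(1-1/z)^{\alpha}} = \frac{(\gamma-1)!}{2\pi i}\int_L\frac{e^{zx}\,dz}{z^{\gamma-\alpha}(z-1)^{\alpha}}, \tag{5} $$

and similarly

$$ x^{1-\gamma}\,{}_1F_1(1+\alpha-\gamma,\ 2-\gamma,\ x) = (1-\gamma)!\,p^{\gamma-1}\left(1-\frac{1}{p}\right)^{-(1+\alpha-\gamma)} \tag{6} $$

$$ = \frac{(1-\gamma)!}{2\pi i}\int_L\frac{e^{zx}z^{\alpha-1}\,dz}{(z-1)^{1+\alpha-\gamma}}. \tag{7} $$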

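These are to be understood in the first place as valid for x real and positive; and then we
have by analytic continuation, putting $zx = \lambda$,

$$ {}_1F_1(\alpha,\gamma,x) = \frac{(\gamma-1)!}{2\pi i}\int_M\frac{e^{\lambda}\lambda^{\alpha-\gamma}}{(\lambda-x)^{\alpha}}\,d\lambda, \tag{8} $$

$$ {}_1F_1(1+\alpha-\gamma,\ 2-\gamma,\ x) = \frac{(1-\gamma)!}{2\pi i}\int_M\frac{e^{\lambda}\lambda^{\alpha-1}}{(\lambda-x)^{1+\alpha-\gamma}}\,d\lambda, \tag{9} $$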

valid for all x. If the factorial factor is omitted, both are intelligible even if $\gamma$ is an
integer. If we put $\lambda = x + \kappa$ in (8) we get

$$ u_1 = {}_1F_1(\alpha,\gamma,x) = \frac{(\gamma-1)!}{2\pi i}\,e^{x}\int_M e^{\kappa}(\kappa+x)^{\alpha-\gamma}\kappa^{-\alpha}\,d\kappa = e^{x}\,{}_1F_1(\gamma-\alpha,\ \gamma,\ -x), \tag{10} $$

and similarly

$$ u_2 = x^{1-\gamma}\,{}_1F_1(1+\alpha-\gamma,\ 2-\gamma,\ x) = x^{1-\gamma}e^{x}\,{}_1F_1(1-\alpha,\ 2-\gamma,\ -x). \tag{11} $$

(10) is very useful if $\gamma - \alpha$ is an integer $\leq 0$, since the last factor is then a polynomial.
Similarly (11) is useful if $1 - \alpha$ is an integer $\leq 0$. Hence if either $\gamma - \alpha$ or $\alpha$ is an integer,
irrespective of sign, one of the series solutions reduces to an elementary function. Many
recurrence relations exist, analogous to those for the Bessel functions.*
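
The transformation ${}_1F_1(\alpha,\gamma,x) = e^{x}\,{}_1F_1(\gamma-\alpha,\gamma,-x)$ and the reduction to an elementary function can be tested by direct summation. The sketch below assumes only the power series for ${}_1F_1$; the function name `kummer_M` and the sample arguments are hypothetical. In the second check $\gamma - \alpha = -2$, so the transformed series is the polynomial $1 + 2x/\gamma + x^2/\gamma(\gamma+1)$.

```python
import math


def kummer_M(alpha, gamma, x, terms=300):
    """Partial sum of 1F1(alpha, gamma, x); gamma must not be zero or a negative integer."""
    total, term = 1.0, 1.0
    for r in range(terms):
        # ratio of consecutive terms: (alpha+r) x / ((gamma+r)(r+1))
        term *= (alpha + r) * x / ((gamma + r) * (r + 1))
        total += term
    return total


if __name__ == "__main__":
    a, g, x = 0.3, 1.7, 2.5
    print(kummer_M(a, g, x), math.exp(x) * kummer_M(g - a, g, -x))
    g, x = 1.5, 0.8
    print(kummer_M(g + 2.0, g, x),
          math.exp(x) * (1.0 + 2.0 * x / g + x * x / (g * (g + 1.0))))
```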

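Also

$$ \frac{d}{dx}\,{}_1F_1(\alpha,\gamma,x) = \frac{(\gamma-1)!\,\alpha}{2\pi i}\int_M\frac{e^{\lambda}\lambda^{\alpha-\gamma}}{(\lambda-x)^{\alpha+1}}\,d\lambda = \frac{(\gamma-1)!\,\alpha}{\gamma!}\,{}_1F_1(\alpha+1,\ \gamma+1,\ x) = \frac{\alpha}{\gamma}\,{}_1F_1(\alpha+1,\ \gamma+1,\ x), \tag{12} $$

as is also obvious by differentiation of the series.

If we put $\alpha x = y$ and then let $\alpha$ tend to infinity, the integrals tend to the forms of
Schläfli's for the Bessel functions, apart from some simple factors.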


* B.A. Reports (Committee for the Calculation of Mathematical Tables), 1926. Tables also in 
1927 Report. 
23·04. Asymptotic formulae of Stokes's type. 23·03 (8) represents an analytic
function for all values of x, real or complex, but cuts will be needed if $\alpha$ and $\gamma$ are
non-integral. If x has a positive imaginary part we can replace the path M by a pair of
loops about the branch points, and we can show that each loop separately gives a solution
of the differential equation. Take any integral of the form

$$ I = \int\frac{e^{\lambda}\lambda^{\alpha-\gamma}}{(\lambda-x)^{\alpha}}\,d\lambda, \tag{1} $$


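where the limits are independent of x. Then

$$ x\frac{d^2I}{dx^2} + (\gamma - x)\frac{dI}{dx} - \alpha I = \int e^{\lambda}\lambda^{\alpha-\gamma}\left\{\frac{\alpha(\alpha+1)x}{(\lambda-x)^{\alpha+2}} + \frac{\alpha(\gamma-x)}{(\lambda-x)^{\alpha+1}} - \frac{\alpha}{(\lambda-x)^{\alpha}}\right\}d\lambda. \tag{2} $$

In the numerators replace x by $\lambda - (\lambda - x)$ and rearrange in powers of $\lambda - x$; then

$$ \frac{\alpha(\alpha+1)x}{(\lambda-x)^{\alpha+2}} + \frac{\alpha(\gamma-x)}{(\lambda-x)^{\alpha+1}} - \frac{\alpha}{(\lambda-x)^{\alpha}} \tag{3} $$

$$ = \frac{\alpha(\alpha+1)\lambda}{(\lambda-x)^{\alpha+2}} - \frac{\alpha(\alpha+1)}{(\lambda-x)^{\alpha+1}} - \frac{\alpha\lambda}{(\lambda-x)^{\alpha+1}} + \frac{\alpha\gamma}{(\lambda-x)^{\alpha+1}}. \tag{4} $$

Now

$$ \int e^{\lambda}\lambda^{\alpha-\gamma+1}\frac{\alpha(\alpha+1)}{(\lambda-x)^{\alpha+2}}\,d\lambda = -\left[\frac{\alpha e^{\lambda}\lambda^{\alpha-\gamma+1}}{(\lambda-x)^{\alpha+1}}\right] + \int\frac{\alpha e^{\lambda}}{(\lambda-x)^{\alpha+1}}\{\lambda^{\alpha-\gamma+1} + (\alpha-\gamma+1)\lambda^{\alpha-\gamma}\}\,d\lambda, \tag{5} $$
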

and the integral cancels the integrals arising from the last three terms of (4). Hence (1)
is a solution of the differential equation provided that the
integrated part of (5) vanishes, and this will be satisfied for
any path such that $\Re(\lambda) = -\infty$ at the ends. Hence integrals
on the paths $M_1$ and $M_2$ in the figure give separate solutions
of the differential equation.

On $M_1$ we must attend specially to the phase of $\lambda - x$. $\lambda^{\alpha}$ is
taken real and positive on the positive real axis, and therefore
$(\lambda - x)^{\alpha}$, when $\lambda - x$ is real and positive, is also real and
positive. If x tends to a real positive value, the contours
being deformed so as never to overlap, we can get from $\lambda$
real and greater than $\Re(x)$ to a value near the origin only by turning through $-\pi$ about x;
hence on $M_1$

$$ (\lambda - x)^{\alpha} = (x - \lambda)^{\alpha}e^{-\alpha\pi i}. $$

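On $M_1$ we can now expand in descending powers of x;

$$ I_1 = \int_{M_1}\frac{e^{\lambda}\lambda^{\alpha-\gamma}}{(\lambda-x)^{\alpha}}\,d\lambda = x^{-\alpha}e^{i\alpha\pi}\int_{M_1}e^{\lambda}\lambda^{\alpha-\gamma}\left(1 + \frac{\alpha\lambda}{x} + \frac{\alpha(\alpha+1)\lambda^2}{2!\,x^2} + \ldots\right)d\lambda $$

$$ \sim 2\pi i\,x^{-\alpha}e^{i\alpha\pi}\left(\frac{1}{(\gamma-\alpha-1)!} + \frac{\alpha}{(\gamma-\alpha-2)!\,x} + \frac{\alpha(\alpha+1)}{(\gamma-\alpha-3)!\,2!\,x^2} + \ldots\right) $$

$$ = \frac{2\pi i\,x^{-\alpha}e^{i\alpha\pi}}{(\gamma-\alpha-1)!}\left(1 + \frac{\alpha(\gamma-\alpha-1)}{x} + \frac{\alpha(\alpha+1)(\gamma-\alpha-1)(\gamma-\alpha-2)}{2!\,x^2} + \ldots\right). \tag{6} $$
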
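On $M_2$ we put $\lambda = x + \kappa$; then

$$ I_2 = \int_{M_2}\frac{e^{\lambda}\lambda^{\alpha-\gamma}}{(\lambda-x)^{\alpha}}\,d\lambda = e^{x}\int_{-\infty}^{(0+)}\frac{e^{\kappa}(x+\kappa)^{\alpha-\gamma}}{\kappa^{\alpha}}\,d\kappa $$

$$ \sim e^{x}x^{\alpha-\gamma}\int_{-\infty}^{(0+)}e^{\kappa}\kappa^{-\alpha}\left(1 + \frac{(\alpha-\gamma)\kappa}{x} + \frac{(\alpha-\gamma)(\alpha-\gamma-1)\kappa^2}{2!\,x^2} + \ldots\right)d\kappa $$

$$ = 2\pi i\,e^{x}x^{\alpha-\gamma}\left(\frac{1}{(\alpha-1)!} + \frac{\alpha-\gamma}{(\alpha-2)!\,x} + \frac{(\alpha-\gamma)(\alpha-\gamma-1)}{(\alpha-3)!\,2!\,x^2} + \ldots\right) $$

$$ = \frac{2\pi i\,x^{\alpha-\gamma}e^{x}}{(\alpha-1)!}\left(1 + \frac{(\alpha-1)(\alpha-\gamma)}{x} + \frac{(\alpha-1)(\alpha-2)(\alpha-\gamma)(\alpha-\gamma-1)}{2!\,x^2} + \ldots\right). \tag{7} $$
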


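If $\Re(x)$ is large and positive any term of (7) is large compared with $I_1$, and in Poincaré's
sense $I_1$ can be neglected. With actual values of $\Re(x)$, however, $I_1$ may be comparable with
some terms of the $I_2$ series and is then worth retaining. Then for $\Im(x) > 0$

$$ {}_1F_1(\alpha,\gamma,x) \sim \frac{(\gamma-1)!}{(\alpha-1)!}\,e^{x}x^{\alpha-\gamma}\left(1 + \frac{(\alpha-1)(\alpha-\gamma)}{x} + \frac{(\alpha-1)(\alpha-2)(\alpha-\gamma)(\alpha-\gamma-1)}{2!\,x^2} + \ldots\right) $$

$$ \qquad + \frac{(\gamma-1)!}{(\gamma-\alpha-1)!}\,e^{\alpha i\pi}x^{-\alpha}\left(1 + \frac{\alpha(\gamma-\alpha-1)}{x} + \frac{\alpha(\alpha+1)(\gamma-\alpha-1)(\gamma-\alpha-2)}{2!\,x^2} + \ldots\right). \tag{8} $$
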

If $\Im(x) < 0$ the same holds provided that $M_2$ still lies above $M_1$; but if we draw $M_2$, as would
be more natural, so as to lie below $M_1$ the factor $e^{\alpha i\pi}$ must be replaced by $e^{-\alpha i\pi}$. This is
another instance of the discontinuity of constants in asymptotic expansions.

It follows that, again for $\Im(x) > 0$,

$$ x^{1-\gamma}\,{}_1F_1(1+\alpha-\gamma,\ 2-\gamma,\ x) = J_2 + J_1 \sim \frac{(1-\gamma)!}{(\alpha-\gamma)!}\,x^{\alpha-\gamma}e^{x}\left(1 + \frac{(\alpha-\gamma)(\alpha-1)}{x} + \ldots\right) $$

$$ \qquad + \frac{(1-\gamma)!}{(-\alpha)!}\,e^{(1+\alpha-\gamma)i\pi}x^{-\alpha}\left(1 + \frac{(1+\alpha-\gamma)(-\alpha)}{x} + \ldots\right), \tag{9} $$

and the two series in (8) and (9) are identical, but their coefficients are in different ratios.
If $\Re(x)$ is large, however, the portion arising from $M_1$ is negligible in comparison with that
from $M_2$, and the two solutions are almost proportional. If $\Re(x)$ is small, and especially
when x is purely imaginary, we must keep both series. (For $\Im(x) < 0$, reverse i.)

It follows that for varying x, $J_1$ and $J_2$ are constant multiples of $I_1$ and $I_2$, subject to the
same expansions remaining valid. For all four functions are solutions of the differential
equation, and $I_1$ and $I_2$ are not proportional. Hence $J_1$ and $J_2$ can be linearly expressed in
terms of $I_1$ and $I_2$. But if $J_1 = AI_1 + BI_2$, and $B \neq 0$, $J_1$ will increase like $e^{x}$ as $\Re(x) \to \infty$,
and it does not. Hence B = 0. Similarly by making $\Re(x) \to -\infty$ we show that $J_2$ is a
constant multiple of $I_2$.

This result makes it possible to express $I_1$, $I_2$, $J_1$, $J_2$ in terms of the series solutions
$u_1$, $u_2$ when these exist. For if we write


(r-i)i 

2m 


\ 


e A A a ~y 
mM-*)* 


dX = S-, 


(T-l)!f e^* dX 
2m J M t (X~ x ) cl 


8 . 


2 > 


( 10 ) 


2m J , 


e AA*-i 


M 1 (^~ x ) a ~ y+1 


(i-r)L ,- 

2m 


j 

jM t 


e A A a_1 


(A—aj) a_ y +1 


dX - Jn 


2 > 


( 11 ) 






we have

M-y — Si + S% , 

( 12 ) 



(13) 

whence 


(14) 

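We make use repeatedly of the identity

$$ \sin\alpha\pi = \frac{\pi}{(\alpha-1)!\,(-\alpha)!}. \tag{15} $$

Then the coefficient of $S_1$ on the left of (14) is

$$ \frac{\pi\sin\gamma\pi\;e^{-i\alpha\pi}}{(-\alpha)!\,(\alpha-\gamma)!\,\sin\alpha\pi\,\sin(\gamma-\alpha)\pi}. \tag{16} $$

Now if $\lambda = -\mu$,

$$ S_1 = \frac{(\gamma-1)!\,e^{i\alpha\pi}}{(\gamma-\alpha-1)!\,(\alpha-\gamma)!}\int_0^{\infty}\frac{e^{-\mu}\mu^{\alpha-\gamma}}{(x+\mu)^{\alpha}}\,d\mu \quad (\Re(\alpha-\gamma) > -1), \tag{17} $$

$$ J_1 = -\frac{(1-\gamma)!}{\pi}\,\sin\alpha\pi\;e^{i(\alpha-\gamma)\pi}\,x^{1-\gamma}\int_0^{\infty}\frac{e^{-\mu}\mu^{\alpha-1}}{(x+\mu)^{\alpha-\gamma+1}}\,d\mu \quad (\Re(\alpha) > 0), \tag{18} $$

and from the latter,

$$ S_1 = \frac{(\gamma-1)!\,e^{i\alpha\pi}}{(\gamma-\alpha-1)!\,(\alpha-1)!}\,x^{1-\gamma}\int_0^{\infty}\frac{e^{-\mu}\mu^{\alpha-1}}{(x+\mu)^{\alpha-\gamma+1}}\,d\mu \quad (\Re(\alpha) > 0). \tag{19} $$

We write

$$ S_1 = e^{i\alpha\pi}\,\frac{(\gamma-1)!}{(\gamma-\alpha-1)!}\,U(\alpha,\gamma,x); \tag{20} $$

then for $0 < \arg x < \pi$ (replacing i by $-i$ for $-\pi < \arg x \leq 0$)

$$ U(\alpha,\gamma,x) = \frac{e^{-i\alpha\pi}\,(\gamma-\alpha-1)!}{2\pi i}\int_{M_1}\frac{e^{\lambda}\lambda^{\alpha-\gamma}}{(\lambda-x)^{\alpha}}\,d\lambda \quad \text{from (20)}, \tag{21} $$

$$ = \frac{e^{i(\gamma-\alpha-1)\pi}\,(-\alpha)!}{2\pi i}\,x^{1-\gamma}\int_{M_1}\frac{e^{\lambda}\lambda^{\alpha-1}}{(\lambda-x)^{\alpha-\gamma+1}}\,d\lambda \quad \text{from (11), (13) and (20)}, \tag{22} $$

$$ = \frac{1}{(\alpha-\gamma)!}\int_0^{\infty}\frac{e^{-\mu}\mu^{\alpha-\gamma}}{(x+\mu)^{\alpha}}\,d\mu \quad (\Re(\alpha-\gamma) > -1) \quad \text{from (17)}, \tag{23} $$

$$ = \frac{x^{1-\gamma}}{(\alpha-1)!}\int_0^{\infty}\frac{e^{-\mu}\mu^{\alpha-1}}{(x+\mu)^{\alpha-\gamma+1}}\,d\mu \quad (\Re(\alpha) > 0) \quad \text{from (19)}, \tag{24} $$

$$ = \frac{(-\gamma)!}{(\alpha-\gamma)!}\,u_1 + \frac{(\gamma-2)!}{(\alpha-1)!}\,u_2 \quad (\gamma\ \text{non-integral}) \quad \text{from (14) and (16)}, \tag{25} $$

$$ \sim x^{-\alpha}\left(1 + \frac{\alpha(\gamma-\alpha-1)}{1!\,x} + \frac{\alpha(\alpha+1)(\gamma-\alpha-1)(\gamma-\alpha-2)}{2!\,x^2} + \ldots\right) \quad \text{from (6)}. \tag{26} $$
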

(26) is a terminating series if $\alpha - 1$ or $\alpha - \gamma$ is a negative integer, and is then an exact
solution. In either of these cases one of the series solutions in (25) is multiplied by zero, and
U reduces to a multiple of the other.

If $\gamma - \alpha - 1$ or $-\alpha$ is a negative integer, (21) or (22) is defined by continuity.
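
The agreement between the combination of series solutions and the asymptotic series can be examined numerically. The sketch below assumes the forms $U = \frac{(-\gamma)!}{(\alpha-\gamma)!}u_1 + \frac{(\gamma-2)!}{(\alpha-1)!}u_2$ and $U \sim x^{-\alpha}\{1 + \alpha(\gamma-\alpha-1)/x + \ldots\}$, with $n!$ evaluated as `math.gamma(n + 1)`. The asymptotic series is truncated at its smallest term, which is a choice made for this sketch. Function names and sample values are hypothetical; x is taken real and positive and $\gamma$ non-integral.

```python
import math


def kummer_M(alpha, gamma, x, terms=300):
    """Partial sum of the series 1F1(alpha, gamma, x)."""
    total, term = 1.0, 1.0
    for r in range(terms):
        term *= (alpha + r) * x / ((gamma + r) * (r + 1))
        total += term
    return total


def U_series(alpha, gamma, x):
    """U from the two series solutions (gamma non-integral, alpha and alpha-gamma+1 not poles)."""
    u1 = kummer_M(alpha, gamma, x)
    u2 = x ** (1.0 - gamma) * kummer_M(1.0 + alpha - gamma, 2.0 - gamma, x)
    # (-gamma)! = Gamma(1-gamma), (alpha-gamma)! = Gamma(alpha-gamma+1), etc.
    return (math.gamma(1.0 - gamma) / math.gamma(alpha - gamma + 1.0) * u1
            + math.gamma(gamma - 1.0) / math.gamma(alpha) * u2)


def U_asymptotic(alpha, gamma, x, max_terms=60):
    """Descending series for U, stopped when the terms begin to increase."""
    total, term = 0.0, 1.0
    for n in range(max_terms):
        total += term
        nxt = term * (alpha + n) * (gamma - alpha - 1.0 - n) / ((n + 1) * x)
        if abs(nxt) > abs(term):
            break
        term = nxt
    return x ** (-alpha) * total


if __name__ == "__main__":
    print(U_series(0.5, 1.3, 15.0), U_asymptotic(0.5, 1.3, 15.0))
    print(U_asymptotic(-1.0, 2.5, 3.0))  # terminating case
```

For $\alpha = -1$ the descending series terminates and gives $U = x - \gamma$ exactly, an instance of the terminating case noted above.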


A full discussion of the functions was first given by Barnes, who converted the series
into complex integrals involving z!, where z is a complex variable of integration, the
terms of the series being the residues of a function at the poles of a factorial. We shall
also give another method due to Goldstein.


23·05. Goldstein's operational expression.* We now consider a different treatment,
in which the operational form for a confluent hypergeometric function is obtained
in terms of Bessel functions, and the result 23·04 (23) is again derived. In reducing the
series to operational expressions we have introduced such a power of x that the operator
takes an algebraic form, the $\gamma$ factors being removed from the denominators. We could,
however, remove the $\alpha$ factors from the numerators. We first replace x by $-a^2/4t$, and
if p is now the operator corresponding to t we have


pn 


t~ n 

F»>i’ 


pn-i-l — 


£—71—1 

(-»-!)! 


nt~n ~ 1 


Til ® 2 \ t ( — a)! a 2 ( — a)! (a 2 \ 2 

ii(a,y,--J- +(_ a _ 1 )j i\ y Tt + {-a,-2)\2\y{y+l)\UJ + 


( 1 ) 

(2) 


, . / 4 A a ( 1 [a 2 Y 1 /a 2 \ a+1 ) 

~ { a) v) U-flOlU) + (-a-l)!l!rW + "'| 

= (— a)! (y — 1 )! (|a) 1 _, >'i <x p a_ 1 /a >'+ 1 / 2 Z,_ 1 (a.p 1/2 ). 


(3) 


This is exact for all t with a positive real part. For if we interpret $I_{\gamma-1}(ap^{1/2})$ by means of
a complex integral on the path M it gives a factor of order $\exp(az^{1/2})$, which is overwhelmed
by the factor $\exp(tz)$ when t has a positive real part and $z \to -\infty$. Then using
23·03 (3) and (10), we find


i^( a ’ r, s) = « aV Vi(r-‘*>r. -5) 

_ e o a /«^ a —y)! (y — 1)! (|a) 1-r £?- a p 1 / 2 y-a+ 1 / 2 / r _ 1 (ap 1/a ). (4) 

Similarly, using 23-03(11), 

(ii) ^ lFl — 2 ~7>^) = e° 8 / "(a-1)! (1 -y)! (|a) 1 -^- a p 1 /ay- a + 1 / 2 / 1 _ y (ap 1 ^). (5) 

The constant factors are in the same ratio as in 23·04 (14). Also if a is large $I_{\gamma-1}(ap^{1/2})$
and $I_{1-\gamma}(ap^{1/2})$ are nearly equal, and a suitable solution with a different behaviour at
infinity will be

U = 7re a2 / 4< (ia) 1 -yir-«p 1 /2y-a+i/ 2 iQ ll _ y (ap 1 /2). (6) 

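To investigate this we use the expression

$$ \mathrm{Kh}_n(x) = \frac{1}{\pi}\left(\tfrac12 x\right)^{n}\int_0^{\infty}\exp\left(-u - \frac{x^2}{4u}\right)u^{-n-1}\,du \quad (|\arg x| < \tfrac14\pi). \tag{7} $$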


* Proc. Lond. Math. Soc. (2) 34, 1931, 103-25. 
If we interpret (6) by using the path L, U converges since

$$ \mathrm{Kh}_n(z) = O(z^{-1/2}e^{-z}), $$

and on this path the relation $\Re(a^2z) > 0$ is satisfied. Then


But 

and 

Hence (9) is 


Khi_ y (ap^) = t(|apV 2 )y-ij^ exp {^-u-u~vdu 

= \ (iap 11 *)?- 1 ^ e-vu-W ^ du 

1 f 00 

= - e-*vrvdu y 

n J a 1 lit 

irp 1 l*r-*+V*Kh 1 _ Y {ap 1 l*) = (i a )y-i_py-a f” e -u u -7 du. 

J a*/it 

\. F {p)9{t) = f f(t—r)g(r)dr 
V Jo 


pr~ a+1 H(t) = 


t*-7-1 


(Sft(a —y) > — 1). 


(a-y- 1 )! 
Jo(a-y-l)! J o*/4r 


( 8 ) 

( 9 ) 

( 10 ) 

( 11 ) 

( 12 ) 


The integration is over the shaded region in the diagram. Reversing the order we have
for the limits $a^2/4u < \tau < t$, $a^2/4t < u < \infty$; and we have


(\a)y- x f e-^u-vdu f 
Ja*/« J a* 


(t — 7 )a—y-i 
'/ 4 m (a—y— 1 )! 


dr 


\a-y 

I du 


r e~^u~y it — — 

(a-y)!j a */« \ 

(4a)r-l /*« / a 2\a-y 



Hence 


■ < i3 > 

!7 = (^r)!r e "’^( t ’ + s)'* <i *’ <«(*-r)>-i) (H) 

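is the solution required. Also, using the definition of $\mathrm{Kh}_n$ in terms of $I_{-n}$ and $I_n$, we have

$$ U(\alpha,\gamma,x) = \frac{1}{(\alpha-\gamma)!}\int_0^{\infty}e^{-v}v^{\alpha-\gamma}(v+x)^{-\alpha}\,dv \tag{15} $$

$$ = -\pi\,\mathrm{cosec}\,\gamma\pi\left\{\frac{u_2}{(\alpha-1)!\,(1-\gamma)!} - \frac{u_1}{(\alpha-\gamma)!\,(\gamma-1)!}\right\} = \frac{(-\gamma)!}{(\alpha-\gamma)!}\,u_1 + \frac{(\gamma-2)!}{(\alpha-1)!}\,u_2. \tag{16} $$

The numerical factors can be checked by taking x = 0. This result is identical with
23·04 (25).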
Now

$$ \mathrm{Kh}_{-n}(z) = \mathrm{Kh}_n(z) \tag{17} $$

and therefore also


U { a> 7 , Zt) = ^ e02/ ^(^) 1 ~ YiY_ V /ay_a+1/2Kh y-l( a P 1/2 ) 


_ i _ WT'sm r (18) 

(a-l)!W J a*/ 4 t \ to) 


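by similar methods; then

$$ U(\alpha,\gamma,x) = \frac{1}{(\alpha-1)!}\,x^{1-\gamma}e^{x}\int_x^{\infty}e^{-u}u^{\gamma-\alpha-1}(u-x)^{\alpha-1}\,du $$

$$ = \frac{1}{(\alpha-1)!}\int_0^{\infty}x^{1-\gamma}e^{-v}v^{\alpha-1}(v+x)^{\gamma-\alpha-1}\,dv \quad (\Re(\alpha) > 0). \tag{19} $$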

The two forms (15), (19) were found in 23·04 by using the two loop integrals $I_1$ and $J_1$.

Since 23·04 (21) (22) are analytic functions of $\alpha$ and $\gamma$ for all values, they can be
taken as providing definitions of U for unrestricted $\alpha$ and $\gamma$; and then (16), being true
for a continuous range of values of $\alpha$ and $\gamma$, will be true for all $\alpha$, $\gamma$, the right side being
defined by continuity when $\gamma$ is integral. They become ambiguous if x is real and negative,
so that a cut along the negative real axis is needed for x, but a similar cut is needed to
define $x^{1-\gamma}$ if $\gamma$ is not an integer, so that this involves no loss of generality.

The integrals are inconvenient for finding the convergent series expansions directly;
for if we try to expand a power of $u - x$ in ascending powers of x the series will diverge for
$|u| < |x|$ and it is impossible to find a path passing between the singularities such that
the series converges at all points of it.

If we take the loop $M_2$ for either integral we find without much trouble that it is a
multiple of $e^{x}U(\gamma-\alpha,\gamma,z)$, where $z = xe^{\pm i\pi}$. Owing to the restriction $|\arg z| < \pi$ (cf.
23·04 (21)) we take the lower sign when $\Im(x) > 0$ and the upper sign when $\Im(x) < 0$. For
$\Im(x) > 0$ we make $M_2$ lie entirely above the real axis and find a solution


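$$ V(\alpha,\gamma,x) = \frac{(\alpha-1)!}{2\pi i}\,e^{x}\int_{-\infty}^{(0+)}e^{\kappa}(x+\kappa)^{\alpha-\gamma}\kappa^{-\alpha}\,d\kappa = \frac{(\alpha-\gamma)!}{2\pi i}\,e^{x}x^{1-\gamma}\int_{-\infty}^{(0+)}e^{\kappa}(x+\kappa)^{\alpha-1}\kappa^{\gamma-\alpha-1}\,d\kappa \sim e^{x}x^{\alpha-\gamma}, \tag{20} $$

and

$$ u_1 = {}_1F_1(\alpha,\gamma,x) = \frac{(\gamma-1)!}{(\alpha-1)!}\,V + \frac{(\gamma-1)!}{(\gamma-\alpha-1)!}\,e^{\alpha i\pi}\,U \quad (\Im(x) > 0), \tag{21} $$

$$ u_2 = x^{1-\gamma}\,{}_1F_1(1+\alpha-\gamma,\ 2-\gamma,\ x) = \frac{(1-\gamma)!}{(\alpha-\gamma)!}\,V + \frac{(1-\gamma)!}{(-\alpha)!}\,e^{(1+\alpha-\gamma)i\pi}\,U \quad (\Im(x) > 0), \tag{22} $$

whence

$$ V(\alpha,\gamma,x) = \frac{\pi}{\sin\gamma\pi}\left\{\frac{e^{\alpha i\pi}}{(\gamma-\alpha-1)!\,(1-\gamma)!}\,u_2 + \frac{e^{(\alpha-\gamma)i\pi}}{(-\alpha)!\,(\gamma-1)!}\,u_1\right\}. \tag{23} $$
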

For $\Im(x) < 0$ we make $M_2$ lie entirely below the real axis; the result is a different solution
$W(\alpha,\gamma,x)$, which is equal to the expression on the right of (23) with i replaced by $-i$. For
$\Im(x) < 0$, $W \sim e^{x}x^{\alpha-\gamma}$ and the analytic continuation of $V(\alpha,\gamma,x)$ is

$$ V(\alpha,\gamma,x) = W(\alpha,\gamma,x) - \frac{2\pi i\,U(\alpha,\gamma,x)}{(\gamma-\alpha-1)!\,(-\alpha)!}. \tag{24} $$
The interesting property of V, apart from its simple asymptotic expansion, is that it
tends exponentially to 0 as $x \to -\infty$. Further, V as determined by either of the integrals
(20) is a solution of the differential equation, independent of U, even when $\gamma$ is an
integer and the two series solutions coalesce.

We have also seen from 23·03 (10) and (11) that one solution is expressible in finite
terms if either $\gamma - \alpha$ or $1 - \alpha$ is an integer. In particular

$$ {}_1F_1(\alpha,\alpha,x) = e^{x}, \qquad {}_1F_1(0,\gamma,x) = 1, \qquad {}_1F_1(-1,\gamma,x) = 1 - x/\gamma. $$


23·051. Convergent expansion of $U(\alpha,\gamma,x)$ when $\gamma$ is a positive integer. Put
$\gamma = m + c$, where m is a positive integer, and let c tend to zero.

$$ U(\alpha,\ m+c,\ x) = \frac{(-m-c)!}{(\alpha-m-c)!}\,{}_1F_1(\alpha,\ m+c,\ x) + \frac{(m+c-2)!}{(\alpha-1)!}\,x^{1-m-c}\,{}_1F_1(1+\alpha-m-c,\ 2-m-c,\ x). $$

When $c \to 0$ the terms up to $x^{m-2}$ in the second series give negative powers of x and have
no counterparts in the first series. The corresponding terms in U tend to

$$ U_1 = \frac{(m-2)!}{(\alpha-1)!}\,x^{1-m}\,G(x), $$

where G(x) consists of the expansion of ${}_1F_1(1+\alpha-m,\ 2-m,\ x)$ up to the term in $x^{m-2}$.
Next, take the term in $x^{r}$ in the first series with that in $x^{r+m-1}$ in the second. With their
proper multipliers they give


, _ (-m-c)! g(a+l)... (a + r-1) 

(a—m—c)!r! (m + c)... (m + c+r — 1) 

(m + c- 2 )! (1 + a-m-c)... (r + a-l-c) 
(a—1)! (r + m — 1)! (2 — m — c) ... (r — c)^ 

__ (-m-c)! (a + r-1)! (m + c- 1)! 

(a —m—c)! (a — l)!r! (m + c + r—1)!^ 


(m + c —2)! (r + a-l-c)! (l-m-c)i 
(a—1)! (a —m —c)! (r + m —1)! (r —c)!^ 


n _ 1 _ ( (a + r- 1)! ^ (a + r— 1 —c>! 

sin (m + c) tt (a — 1)! (a — m — c)! \(m + r — 1 + c)! rs (m+r-1)! 

->(-!)* 


1 & _ 

(a— 1)! (a —m)! 

f (a + r-l)! , (a + r-1)! 


L ^a-t-r — iji ia + r—1)! - -1 

. , (from 15*04) 

(-1) TO a(a+l)... (a + r-1) _ ' 

(m-l)!(a-m)!r!m(m + l)...(m + r-l) af[l0ga? “^ m+r “ 1) ~^ r ) +/r ( a+r “ 1 )]* 
But

—F(m+r—l)—F(r)+F{a+r—l) 

v 11 1 

= — Flm — 1) — F(0) + F(ct — 1)--j-...--- 7 

v ' ' ' ' / m m + 1 ra + r— 1 

_1_1 1 1 _J_ 1 

1 2'“ r *^"a"*~a + 1 **’ a + r—1* 

oo /_l \m N _ 

Hence Z7 2 = 2 w r = W ’ ^ log *~ F ^ m “ 1 )~^(°) + ~ + U * 

where 

r/ = v (-l) ro g(a + l)...(a+r-l) ^ 

8 ,-i(w —l)!(a—w)!r!m(m + l)... (m + r— 1) 

/I 1 1 _1_ _1_1_1_ 1 \ 

X \<x + oc+ l“* + a + r— 1 1 r m m+1 *’* m + r— 1/ 

and H (a, m, x) = + £7 2 . 

This solution is due to Stoneley.* A function partly tabulated by H. A. Webb and
J. R. Airey† omits the terms $U_1$.
If $m = 1$, the terms $U_1$ do not arise.

If $\alpha$ is a positive integer less than $m$, $U$ reduces to $U_1$. In this case, and also if $\alpha$ is
a negative or zero integer, $U$ is given exactly by 23.04 (26).

If $\gamma$ is 0 or a negative integer, a convergent expression for $U$ may be obtained by
similar methods, but the chief practical case is that discussed directly in 23.07.
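The logarithmic expansion $U = U_1 + U_2$ can be tested numerically. The sketch below makes several assumptions that are not in the text: $\alpha$ is restricted to integers with $\alpha \ge m \ge 1$, so that $F(n) = -C + 1 + \tfrac12 + \dots + \tfrac1n$ ($C$ being Euler's constant) needs only harmonic sums; the comparison value is taken from the standard Laplace integral $U(\alpha,\gamma,x) = \frac{1}{(\alpha-1)!}\int_0^\infty e^{-xt}t^{\alpha-1}(1+t)^{\gamma-\alpha-1}\,dt$; and the function names are illustrative.

```python
import math

EULER_C = 0.5772156649015329

def F(n):
    """F(n) = d/dz log z! at an integer n >= 0, i.e. -C + 1 + 1/2 + ... + 1/n."""
    return -EULER_C + sum(1.0 / k for k in range(1, n + 1))

def u_log_series(alpha, m, x, terms=80):
    """U(alpha, m, x) = U_1 + U_2 for integers alpha >= m >= 1 and x > 0."""
    u1 = 0.0
    if m >= 2:
        # G(x): expansion of 1F1(1 + alpha - m, 2 - m, x) up to the term in x^(m-2)
        a, c = 1 + alpha - m, 2 - m
        g, term = 1.0, 1.0
        for k in range(m - 2):
            term *= (a + k) * x / ((c + k) * (k + 1))
            g += term
        u1 = math.factorial(m - 2) / math.factorial(alpha - 1) * x ** (1 - m) * g
    prefactor = (-1) ** m / (math.factorial(m - 1) * math.factorial(alpha - m))
    u2, coeff = 0.0, 1.0     # coeff = (alpha)_r / (r! (m)_r)
    for r in range(terms):
        bracket = math.log(x) - F(m + r - 1) - F(r) + F(alpha + r - 1)
        u2 += coeff * x ** r * bracket
        coeff *= (alpha + r) / ((r + 1) * (m + r))
    return u1 + prefactor * u2

def u_integral(alpha, m, x, upper=60.0, steps=20000):
    """Simpson's rule for the Laplace integral quoted in the lead-in (an assumption)."""
    def f(t):
        return math.exp(-x * t) * t ** (alpha - 1) * (1.0 + t) ** (m - alpha - 1)
    h = upper / steps
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3 / math.factorial(alpha - 1)

for alpha, m, x in [(1, 1, 2.0), (2, 2, 1.0), (3, 2, 1.5), (3, 3, 0.8)]:
    print(alpha, m, x, u_log_series(alpha, m, x), u_integral(alpha, m, x))
```

The two columns should agree to the accuracy of the quadrature. For non-integral $\alpha$ the same series applies, but $F$ would then need a general digamma routine, which the Python standard library does not supply.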


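23.06. Whittaker's transformation. If in the original differential equation

$$ x u'' + (\gamma - x) u' - \alpha u = 0 \tag{1} $$

we put

$$ u = v e^{\frac12 x} \tag{2} $$

we get

$$ v'' + \frac{\gamma}{x}\,v' + \left(-\frac14 + \frac{\frac12\gamma - \alpha}{x}\right) v = 0, \tag{3} $$

and the further substitution

$$ v = x^{-\frac12\gamma} w \tag{4} $$

gives

$$ w'' + \left\{-\frac14 + \frac{\frac12\gamma - \alpha}{x} + \frac{\gamma(2-\gamma)}{4x^2}\right\} w = 0. \tag{5} $$

Putting

$$ w = x^{1/2} y \tag{6} $$

we have the further form

$$ x\frac{d}{dx}\left(x\frac{dy}{dx}\right) - \{\tfrac14 x^2 - (\tfrac12\gamma - \alpha)x + \tfrac14(\gamma-1)^2\}\,y = 0. \tag{7} $$

(7) is interesting because if $\alpha = \tfrac12\gamma$ it reduces to a form of Bessel's equation, and solutions
are $I_{\pm\frac12(\gamma-1)}(\tfrac12 x)$. Then

$$ {}_1F_1(\alpha, 2\alpha, x) = (\alpha - \tfrac12)!\,e^{\frac12 x}\,(\tfrac14 x)^{\frac12-\alpha}\,I_{\alpha-\frac12}(\tfrac12 x). \tag{8} $$

When the term in $x$ is not zero, however, it can produce a profound change in the character
of the solutions. Evidently $\mathrm{Kh}_{\frac12(\gamma-1)}(\tfrac12 x)$ will be a further solution of (7) if $\alpha = \tfrac12\gamma$
and corresponds to the solution $U$; but if this condition is not satisfied it is possible to
find $\alpha$ and $\gamma$ such that $U$ is a multiple of one of the series solutions. The Bessel analogue
would be that $\mathrm{Kh}_n$ could be proportional to either $I_n$ or $I_{-n}$.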

* M.N.R.A.S. Geophys. Suppl. 3, 1934, 226-8. See also D. R. Hartree, Proc. Camb. Phil. Soc. 24,
1928, 426-37 for the corresponding expansion of Whittaker's function.
† Phil. Mag. (6), 36, 1918, 129-41.




(5) is Whittaker's form, and is written by him as

$$ w'' + \left(-\frac14 + \frac{k}{x} + \frac{\frac14 - m^2}{x^2}\right) w = 0, \tag{9} $$

so that

$$ k = \tfrac12\gamma - \alpha, \qquad m = \tfrac12(1 - \gamma). \tag{10} $$

This notation simplifies the writing of the differential equation slightly. But if we try to
solve (3), (5) or (7) for general $\gamma$, $\alpha$ by a series we get a three term relation between the
coefficients, and it appears that in any useful series $\alpha$ and $\gamma$ will enter explicitly into the
solutions. We shall therefore write solutions of (5) as

$$ e^{-\frac12 x}\,x^{\frac12\gamma}\,\{{}_1F_1(\alpha,\gamma,x);\ x^{1-\gamma}\,{}_1F_1(1+\alpha-\gamma,\,2-\gamma,\,x);\ U(\alpha,\gamma,x)\}, \tag{11} $$

while those of (7) need an extra factor $x^{-1/2}$ and therefore, unless $\gamma = 1$, one of the series
solutions tends to infinity at the origin.

An alternative form comes by putting $x = 2\mu z$; then

$$ \frac{d^2w}{dz^2} + \left\{-\mu^2 + \frac{2\mu(\frac12\gamma - \alpha)}{z} + \frac{\gamma(2-\gamma)}{4z^2}\right\} w = 0. \tag{12} $$

In many applications $\mu$ is $\pm 1$ or $\pm i$.

A case of special interest is where a solution of (5) is required to tend to zero at $x = +\infty$,
and also to be small compared with $x^{1/2}$ for $x$ small. The former condition requires that the
solution shall be that in $U$. But $U$ is a linear combination of two solutions of (1), of
which one is bounded and not zero near $x = 0$, and the other behaves like $x^{1-\gamma}$, except
for $\gamma = 1$, when the second solution behaves like $\log x$. The corresponding solutions of
(5) will behave like $x^{\frac12\gamma}$ and $x^{1-\frac12\gamma}$ or $x^{1/2}\log x$. Thus if $\gamma > 1$ only the solution of (1)
bounded near the origin is admissible, and if $\gamma < 1$ only the one that behaves like $x^{1-\gamma}$.
In the former case the solution required is ${}_1F_1(\alpha, \gamma, x)$. But this increases like $e^x$ for
large $x$ unless $\alpha - 1$ is a negative integer, when we take the series to end with the term
in $x^{-\alpha}$. In the latter case the solution is $x^{1-\gamma}\,{}_1F_1(1+\alpha-\gamma,\, 2-\gamma,\, x)$, which increases like $e^x$
unless $\alpha - \gamma$ is a negative integer, when we take the series to end with the term in $x^{\gamma-\alpha-1}$.
In either case the admissible solution of (5) is a terminating series multiplied by $e^{-\frac12 x}$.
The function $U$ is therefore of great physical importance. Whittaker's solution takes

the form x-m+'he-^* f 

w h , m {x) = 

— g-Vi^Vay U(ot,y,x)> (13) 

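$$ \alpha = \tfrac12 - m - k, \qquad \gamma = 1 - 2m, \qquad \alpha - \gamma = m - k - \tfrac12. \tag{14} $$

We shall write

$$ W_{k,m}(x) = W(\alpha, \gamma, x) \tag{15} $$

when the $\alpha$, $\gamma$ notation is being used. The differential equation is unaltered if $x$ and $k$ are
replaced by $-x$ and $-k$; hence another solution is $W_{-k,m}(-x)$. The asymptotic expansion
of $W_{k,m}(x)$ is

$$ W_{k,m}(x) \sim x^k e^{-\frac12 x}\left[1 + \frac{m^2 - (k-\frac12)^2}{1!\,x} + \frac{\{m^2-(k-\frac12)^2\}\{m^2-(k-\frac32)^2\}}{2!\,x^2} + \dots\right], \tag{16} $$

which terminates if $m + k - \tfrac12$ is an integer $\ge 0$ or if $m - k + \tfrac12$ is an integer $\le 0$. Also

$$ W_{-k,m}(-x) \sim (-x)^{-k} e^{\frac12 x}\left[1 - \frac{m^2 - (k+\frac12)^2}{1!\,x} + \frac{\{m^2-(k+\frac12)^2\}\{m^2-(k+\frac32)^2\}}{2!\,x^2} - \dots\right], \tag{17} $$

which terminates if $m + k + \tfrac12$ is an integer $\le 0$ or $m - k - \tfrac12$ an integer $\ge 0$.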
23.07. Schrödinger's equation for the hydrogen-like atom. The radial wave
function $R$ satisfies a differential equation reducible to the form

$$ \frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} + \left(2E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}\right) R = 0, \tag{1} $$

where $l$ is a positive integer, and $Z$ is positive. $R$ is required to be bounded for $0 < r < \infty$.
If we put

$$ R = P/r \tag{2} $$

we get

$$ \frac{d^2P}{dr^2} + \left(2E + \frac{2Z}{r} - \frac{l(l+1)}{r^2}\right) P = 0. \tag{3} $$

If $E$ is negative (bound electron) we put $2E = -\tfrac14\kappa^2$, $\kappa r = \rho$, and then find

$$ \frac{d^2P}{d\rho^2} + \left(-\frac14 + \frac{2Z}{\kappa\rho} - \frac{l(l+1)}{\rho^2}\right) P = 0, \tag{4} $$

which is in Whittaker's form with

$$ \gamma = -2l, \qquad \alpha = -l - 2Z/\kappa. \tag{5} $$

The solutions have indices $l+1$ and $-l$ at $\rho = 0$. The latter is excluded, and the solution
required is

$$ P = e^{-\frac12\rho}\,\rho^{l+1}\,{}_1F_1(l+1-2Z/\kappa,\ 2l+2,\ \rho). \tag{6} $$

When $\rho$ is large this will be large like $\exp(\tfrac12\rho)$ unless the series terminates (compare
23.04 (9)), that is, unless $l + 1 - 2Z/\kappa$ is zero or a negative integer. Hence

$$ \kappa = 2Z/(l+s+1) = 2Z/n \tag{7} $$

where $s$ is an integer $\ge 0$. $n$ is called the principal quantum number. Then

$$ E = -\tfrac18\kappa^2 = -\frac12\frac{Z^2}{n^2} \qquad (n = l+1,\ l+2,\ \dots), \tag{8} $$

$$ R \propto \rho^l\,e^{-\frac12\rho}\,{}_1F_1(-s,\ 2l+2,\ \rho). \tag{9} $$
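The quantization argument can be illustrated numerically. With $\kappa = 2Z/n$ the coefficient $2Z/(\kappa\rho)$ in the Whittaker form becomes $n/\rho$, and the terminating series gives a function $P = e^{-\frac12\rho}\rho^{l+1}\,{}_1F_1(-s, 2l+2, \rho)$, $s = n-l-1$, that should satisfy $P'' + (-\tfrac14 + n/\rho - l(l+1)/\rho^2)P = 0$. The sketch below checks this by central differences; the function names, the step `h`, and the sample quantum numbers are illustrative assumptions.

```python
import math

def terminating_coeffs(s, gamma):
    """Coefficients of the polynomial 1F1(-s, gamma, rho), lowest power first."""
    coeffs = [1.0]
    for r in range(s):
        coeffs.append(coeffs[-1] * (-s + r) / ((gamma + r) * (r + 1)))
    return coeffs

def radial_p(n, l, rho):
    """P(rho) = exp(-rho/2) rho^(l+1) 1F1(-s, 2l+2, rho) with s = n - l - 1."""
    s = n - l - 1
    poly = sum(c * rho ** k for k, c in enumerate(terminating_coeffs(s, 2 * l + 2)))
    return math.exp(-0.5 * rho) * rho ** (l + 1) * poly

def whittaker_residual(n, l, rho, h=1e-3):
    """P'' + (-1/4 + n/rho - l(l+1)/rho^2) P, with P'' from central differences."""
    d2 = (radial_p(n, l, rho + h) - 2 * radial_p(n, l, rho) + radial_p(n, l, rho - h)) / h ** 2
    return d2 + (-0.25 + n / rho - l * (l + 1) / rho ** 2) * radial_p(n, l, rho)

def energy(z, n):
    """E = -kappa^2/8 with kappa = 2Z/n, i.e. -Z^2/(2 n^2) in the units of the text."""
    return -z ** 2 / (2.0 * n ** 2)

for n, l in [(1, 0), (2, 0), (2, 1), (3, 1), (4, 2)]:
    print(n, l, energy(1, n), whittaker_residual(n, l, 2.0))
```

The residuals are limited by the finite-difference step rather than by the formula; a residual that stays near zero at several values of $\rho$ is the expected behaviour.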

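The polynomials in this case have a compact operational expression. From 23.03 (4),
if $\gamma$ is an integer $> 1$,

$$ x^{\gamma-1}\,{}_1F_1(-s,\gamma,x) = (\gamma-1)!\left(1 - \frac1p\right)^s p^{1-\gamma} = (\gamma-1)!\,(p-1)^s\,p^{1-\gamma-s}. \tag{10} $$

But

$$ F(p-a)\,1 = e^{ax}\,F(p)\,e^{-ax} \tag{11} $$

and therefore

$$ x^{\gamma-1}\,{}_1F_1(-s,\gamma,x) = (\gamma-1)!\,e^x\,p^s\,(p+1)^{1-\gamma-s}\,e^{-x} = (\gamma-1)!\,e^x\,\frac{p^{s+1}}{(p+1)^{\gamma+s}} = \frac{(\gamma-1)!}{(\gamma+s-1)!}\,e^x\left(\frac{d}{dx}\right)^s (x^{\gamma+s-1}e^{-x}). \tag{12} $$

The polynomials

$$ L_s(x) = e^x\left(\frac{d}{dx}\right)^s (x^s e^{-x}) \tag{13} $$

are called the Laguerre polynomials.* Now

$$ \rho^l e^{-\frac12\rho}\,{}_1F_1(-s,\,2l+2,\,\rho) = \rho^l e^{-\frac12\rho}\,\rho^{-2l-1}\,(2l+1)!\,e^{\rho}\left(\frac{d}{d\rho}\right)^s \frac{\rho^{2l+s+1}}{(2l+s+1)!}\,e^{-\rho} $$

$$ = \frac{(2l+1)!}{(2l+s+1)!}\,\rho^{-l-1}\,e^{\frac12\rho}\left(\frac{d}{d\rho}\right)^s \rho^{2l+s+1}\,e^{-\rho}. \tag{14} $$

Writing

$$ L_s(\rho) = (D-1)^s \rho^s \tag{15} $$

(14) may be written

$$ \frac{(2l+1)!}{(2l+s+1)!}\,\rho^{-l-1}\,e^{-\frac12\rho}\,(D-1)^s\,\rho^{2l+s+1}. \tag{16} $$

If $L^{2l+1}_{2l+s+1}$ denotes the $(2l+1)$th derivative of the $(2l+s+1)$th Laguerre polynomial,

$$ L^{2l+1}_{2l+s+1} = D^{2l+1}(D-1)^{2l+s+1}\rho^{2l+s+1}. \tag{17} $$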

Comparison of coefficients shows that (16) is equal to 




(18) 


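If $E$ is positive (free electron) we have to replace $\kappa$ by $i\kappa$; but for large $r$ the exponential
factor will now be $\exp(\tfrac12 i\kappa r)$ and will not tend to infinity for $r$ large and real. The differential
equation now reads, with $E = \tfrac18\kappa^2$, $i\kappa r = \rho$,

$$ \frac{d^2P}{d\rho^2} + \left(-\frac14 - \frac{2iZ}{\kappa\rho} - \frac{l(l+1)}{\rho^2}\right) P = 0. \tag{19} $$

The solution can now be written

$$ P = e^{-\frac12 i\kappa r}\,(\kappa r)^{l+1}\,{}_1F_1(l+1+2iZ/\kappa,\ 2l+2,\ i\kappa r). \tag{20} $$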


Here $\alpha$ is complex and $i\kappa r$ is purely imaginary. Hence in the asymptotic expansion for
large $r$ we must keep both the series in 23.04 (8). We have

P ~ [ gga (21) 


and iicr must be interpreted as at exp (ini). Then 

p ^ ( 21 4-1)! er irzix ( (kt) 2iZIk e i /aiicr _i h„iq + i) , ( Kr ) 2tZlK q— Vaiw+Vawiff+l)! 

1 ' \(l + 2iZ/K)\ ^(l-2iz(K)\ \ 


( 22 ) 


2(2l+l)\e~ nZlK 


cos 


\(l + 2iZlK)\ 

in which the constant factor is independent of r.f 
(20) can be put in another form. It is equivalent to 


[iKr - \i t(1 + 1 ) + ^ log (kt) - arg (l + 2 iZ/ic) ! j , (23) 


e -i/ai*r( Kr )M-l 


(2Z+1)! 






2m J M (A — iicr) 2izlK+l+1 


dX. 


(24) 


* Cf. Courant-Hilbert, Methoden der mathematischen Physik, 1, 1924, 77-9; E. Schrödinger, Ann. d.
Physik, (4), 80, 1926, 437-90.

† Bethe, Handb. d. Physik, 24/1, p. 289. His $k$ is the present $\tfrac12\kappa$.
Now (19) is unaltered by writing $-l-1$ for $l$. This suggests considering the expression


e -%iKr( Kr )-tW_ tll ! j e A x* iz l«+i(A-iKr)- 2iZ l*+*dA. (25) 

Expanding the integrand in descending powers of $\lambda$ and integrating, we find that (25)
is equal to




/2iZ + l]j 

-W"( Kr y+i - -L_ 1 _P 1 (l + ! + M ; 2Z + 2,iCT) 


(26) 


so that (20) differs from (25) by a constant real factor. It should be noted that the solution 
(20) is real. 


23.08. The parabolic cylinder, Hermite, and Hh functions. Consider the
equation

$$ \frac{d^2y}{dx^2} + (A - Bx^2)\,y = 0 \tag{1} $$

where $A$ and $B$ are real. Clearly all solutions are integral functions of $x$. Put $\xi = x^2$,
$y = \xi^{-1/4}z$. Then

$$ \frac{d^2z}{d\xi^2} + \left(-\frac{B}{4} + \frac{A}{4\xi} + \frac{3}{16\xi^2}\right) z = 0 \tag{2} $$

which is in the extended Whittaker form 23.06 (12), with

$$ \mu = \tfrac12 B^{1/2}, \qquad \alpha = \tfrac14(1 - AB^{-1/2}), \qquad \gamma = \tfrac12, \tag{3} $$

and solutions of (1) are

$$ \exp(-\tfrac12 B^{1/2}x^2)\,\{{}_1F_1(\alpha, \tfrac12, B^{1/2}x^2);\ x\,{}_1F_1(\tfrac12+\alpha, \tfrac32, B^{1/2}x^2);\ U(\alpha, \tfrac12, B^{1/2}x^2)\} \tag{4} $$

with three others obtained from these by changing the sign of $\sqrt B$. Naturally not more
than two of the six solutions can be independent, and in particular if we take the positive
sign for $\sqrt B$ and put

$$ \alpha' = \tfrac14(1 + AB^{-1/2}), \tag{5} $$

$$ \exp(-\tfrac12 B^{1/2}x^2)\,{}_1F_1(\alpha, \tfrac12, B^{1/2}x^2) = \exp(\tfrac12 B^{1/2}x^2)\,{}_1F_1(\alpha', \tfrac12, -B^{1/2}x^2), \tag{6} $$

$$ x\exp(-\tfrac12 B^{1/2}x^2)\,{}_1F_1(\tfrac12+\alpha, \tfrac32, B^{1/2}x^2) = x\exp(\tfrac12 B^{1/2}x^2)\,{}_1F_1(\tfrac12+\alpha', \tfrac32, -B^{1/2}x^2). \tag{7} $$

For the expressions in (6) are even solutions, and those in (7) are odd solutions; and the
ratio of the two sides of each equation tends to 1 when $x$ tends to 0. The same follows
from 23.03 (10).
Three specially important cases are distinguished according to the signs of $A$ and $B$.
First, the equations 18.04 (7), (8) satisfied by the parabolic cylinder functions are of the
form (1). For their solutions to be oscillatory for large $\xi_1$ or large $\xi_2$, as defined in 18.04,
we must have $k^2 > \mu^2$, and therefore $B$ is negative. This case, in tidal theory, would correspond
to a prescribed harmonic motion at a long distance from a parabolic cape, or at the
mouth of a long parabolic bay. Hence in this case the parabolic cylinder functions are
expressible in terms of confluent hypergeometric functions of imaginary argument.
With $A$ real, $\alpha$ will be complex; but the solutions will be real as in the case of the wave
equation for free electrons near a nucleus. The same differential equation arises in wave
mechanics for a linear repulsive field.

$B$ is positive in the case of the harmonic oscillator in wave mechanics, and in tidal theory
or acoustics in the case of a local disturbance due to a parabolic projection. In the latter
type of problem we have as a limiting case diffraction by a semi-infinite screen, which is
one of the few diffraction problems that have been worked out exactly.* If we require a
solution to tend to zero when $x$ tends to infinity in each direction, no linear combination
of the odd and even solutions can satisfy the conditions unless one of the coefficients
vanishes, and then the solution with a non-zero coefficient must reduce to a polynomial
multiplied by an exponential. Hence $2\alpha$ is an integer $\le 0$. This is the case of the
Schrödinger wave equation of the harmonic oscillator.

One solution also reduces to an elementary function if $2\alpha$ is a positive integer. For
then, since $\alpha + \alpha' = \tfrac12$, one of the series on the right of (6), (7) reduces to a polynomial.
But in this case, since it is multiplied by $\exp(\tfrac12 B^{1/2}x^2)$, the elementary solution
will usually be forbidden. We can however still find a solution based on $U$ that tends
to zero as $x \to +\infty$. This will not be an elementary function, but the functions that occur
in problems of heat flow in one and three dimensions, and the $\mathrm{Hh}_n$ functions for $n \ge 0$,
which have importance in statistics, are of this type.*

These solutions are conveniently derived by using the function U and its operational 
expression. Since 



p 1/2 exp ( - 


(8) 

and 

= Q 

i/a 

| e~ x from 21-08 (13), 

(9) 


/ O \ 1/2 

p»+ 1 /4Khi/ a (ap 1/ 2) = pn+Vi e -apv* 




/ 2 V/a 

= 1 — 1 pn e -ap\ 

(10) 

But 

0 

_apVa _ __ tfhg-ap'h 

da * * 

(11) 


/•oo 

I e-° plh da = p- 1 ke~ apv % 

(12) 

and therefore if 2m is an integer ^ 0, 

pm+^Q-apVi _ (. 

and if 2m is an integer < 0, 

(d\ 2m 1 

- 1^ 2 H 1 P -a*M 

1 \da) V(^) ’ 

(13) 


pm+y2 e -apV* — | 

(J» 

(14) 

Now 

u fa, = 7re° # /"(Ia) 1 ^2-«^»/4-«Khi /a (a^ 1 /a) fr om 23-05 (6), 



= TrVagoV^Va— 1 *p 

l l 2 —oiQ—ap 1 l*' 

(15) 


* Lamb, Hydrodynamics, 1932, p. 538. 

Put a = 

— m\ then, if 2a is an integer < 0, 






l 3 \~ 2a 

_ £-a e a*/4/( _ 1 )2a i _ j e ~aVU f 

(16) 


/ 3 \- 2 * 

U(a, i B^x*) = (- i)2«(25 1 /4) 2 «e B1/aa:2 1^1 

(17) 


These solutions are known as the Hermite polynomials; and their products by
$\exp(-\tfrac12 B^{1/2}x^2)$ satisfy (1). For a given $\alpha$, where $2\alpha$ is an integer $\le 0$, (17) is the only
solution that tends to zero at either $x = \infty$ or $x = -\infty$; it does so at both. If $2\alpha$ is not
an integer $\le 0$ there is no solution that tends to zero at both $x = \infty$ and $x = -\infty$.

If 2a is an integer > 0, 

<18) 

U oo \2<x 

dxj e-™*'. (19) 

Thus the solutions can be built up by successive integration or differentiation according
to the sign of $\alpha$. The forms (16), (18) are convenient as they stand in problems of diffusion
(including heat conduction). For other purposes it is probably best to make use of the
tables given in the British Association Tables, vol. 1; though a supplementary table of
the commoner functions at closer intervals and to four or five figures would be very useful.

A table of the related functions $\operatorname{ierf} x = \int_x^\infty (1 - \operatorname{erf} w)\,dw$, $\operatorname{iierf} x = \int_x^\infty \operatorname{ierf} w\,dw$ to four
figures is given by Hartree* (cf. 20.06 (9)).


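23.081. For $n$ an integer $\ge 0$,

$$ \mathrm{Hh}_n(x) = \int_0^\infty \frac{t^n}{n!}\,e^{-\frac12(t+x)^2}\,dt, \tag{1} $$

and for $n$ a negative integer

$$ \mathrm{Hh}_n(x) = (-1)^{n-1}\left(\frac{d}{dx}\right)^{-n-1} e^{-\frac12 x^2}. \tag{2} $$

In particular

$$ \mathrm{Hh}_0(x) = \int_x^\infty e^{-\frac12 u^2}\,du, \qquad \mathrm{Hh}_0(-\infty) = \sqrt{(2\pi)}, \tag{3} $$

$$ \mathrm{Hh}_{-1}(x) = e^{-\frac12 x^2}. \tag{4} $$

Clearly for all $n$

$$ \frac{d}{dx}\,\mathrm{Hh}_n(x) = -\mathrm{Hh}_{n-1}(x) \tag{5} $$

and it is easy to show that $\mathrm{Hh}_n(x)$ satisfies the equation

$$ \frac{d^2y}{dx^2} + x\frac{dy}{dx} - ny = 0. \tag{6} $$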

* Mem. Proc. Manchester Lit. Phil. Soc. 80, 1936, 86-102.
The functions satisfy the recurrence relation

$$ (n+1)\,\mathrm{Hh}_{n+1}(x) + x\,\mathrm{Hh}_n(x) - \mathrm{Hh}_{n-1}(x) = 0. \tag{7} $$
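The recurrence (7) gives a quick way of generating $\mathrm{Hh}_n(x)$ from $\mathrm{Hh}_{-1}(x) = e^{-\frac12x^2}$ and $\mathrm{Hh}_0(x) = \int_x^\infty e^{-\frac12u^2}du = \sqrt{(\tfrac12\pi)}\,\operatorname{erfc}(x/\sqrt2)$. The sketch below compares it with a direct quadrature of $\int_0^\infty (t^n/n!)\,e^{-\frac12(t+x)^2}dt$; the function names, the Simpson rule and its cut-off are illustrative assumptions, not part of the text.

```python
import math

def hh_by_recurrence(n, x):
    """Hh_n(x), n >= 0, from (n+1) Hh_{n+1} = Hh_{n-1} - x Hh_n."""
    below = math.exp(-0.5 * x * x)                                   # Hh_{-1}(x)
    current = math.sqrt(math.pi / 2) * math.erfc(x / math.sqrt(2))   # Hh_0(x)
    for k in range(n):
        below, current = current, (below - x * current) / (k + 1)
    return current

def hh_by_quadrature(n, x, upper=40.0, steps=40000):
    """Simpson's rule for the integral of t^n/n! exp(-(t+x)^2/2) over (0, upper)."""
    def f(t):
        return t ** n / math.factorial(n) * math.exp(-0.5 * (t + x) ** 2)
    h = upper / steps
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

for n in range(4):
    print(n, hh_by_recurrence(n, 0.5), hh_by_quadrature(n, 0.5))
```

Forward use of the recurrence is satisfactory for small $n$ and moderate $x$; for large $n$ with $x > 0$ the subtraction loses figures, and the asymptotic forms of the following sections are then more suitable.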

If we put

$$ y = e^{-\frac14 x^2} z, \tag{8} $$

we find

$$ \frac{d^2z}{dx^2} - (n + \tfrac12 + \tfrac14 x^2)\,z = 0 \tag{9} $$

for all integral $n$, positive or negative. This is of the form of 23.08 (1), which is reduced
to it by taking

$$ x = (4B)^{-1/4}\,\eta. \tag{10} $$

Then the solution of 23.08 (1) that concerns us, in the cases that we have investigated, can
be written compactly as

$$ e^{\frac14\eta^2}\,\mathrm{Hh}_n(\eta) \tag{11} $$

for any integral $n$, positive or negative, and

$$ A = -(2n+1)\sqrt B. \tag{12} $$

The function $D_n(x)$ is defined by

$$ D_n(x) = e^{\frac14 x^2}\,\mathrm{Hh}_{-n-1}(x) = e^{\frac14 x^2}(-1)^n\left(\frac{d}{dx}\right)^n e^{-\frac12 x^2} \tag{13} $$

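for $n$ positive. It clearly has $n$ real zeros. It also has an orthogonal property, as we should
expect because it arises in the solution of the wave equation. In fact

$$ \int_{-\infty}^{\infty} D_m(x)\,D_n(x)\,dx = (-1)^{m+n}\int_{-\infty}^{\infty} e^{\frac12 x^2}\left(\frac{d^m}{dx^m}e^{-\frac12 x^2}\right)\left(\frac{d^n}{dx^n}e^{-\frac12 x^2}\right)dx = (-1)^n\int_{-\infty}^{\infty} e^{-\frac12 x^2}\,\frac{d^m}{dx^m}\left(e^{\frac12 x^2}\frac{d^n}{dx^n}e^{-\frac12 x^2}\right)dx. \tag{14} $$

If $m \ne n$, we can take $m$ to be the greater. Then the function differentiated $m$ times is a
polynomial of degree $n$, and the derivative is zero. Hence the integral is zero unless
$m = n$. If $m = n$ the term of highest degree is $(-1)^n x^n$, and

$$ \int_{-\infty}^{\infty} \{D_n(x)\}^2\,dx = \int_{-\infty}^{\infty} n!\,e^{-\frac12 x^2}\,dx = \sqrt{(2\pi)}\,n!. \tag{15} $$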
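The orthogonality property and the normalization (15) are easy to confirm numerically. The sketch below generates $D_n(x)$ from $D_0 = e^{-\frac14x^2}$, $D_1 = x\,e^{-\frac14x^2}$ and the relation $D_{n+1} = xD_n - nD_{n-1}$, which follows from the recurrence (7) of this section on replacing its index by $-n-1$; the function names, integration range and Simpson rule are illustrative assumptions.

```python
import math

def parabolic_d(n, x):
    """D_n(x) for integer n >= 0 from D_{k+1} = x D_k - k D_{k-1}."""
    prev = math.exp(-0.25 * x * x)   # D_0
    if n == 0:
        return prev
    cur = x * prev                   # D_1
    for k in range(1, n):
        prev, cur = cur, x * cur - k * prev
    return cur

def overlap(m, n, half_width=14.0, steps=8000):
    """Simpson's rule for the integral of D_m D_n over (-half_width, half_width)."""
    h = 2 * half_width / steps
    def f(x):
        return parabolic_d(m, x) * parabolic_d(n, x)
    total = f(-half_width) + f(half_width)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(-half_width + i * h)
    return total * h / 3

print(overlap(3, 3), math.sqrt(2 * math.pi) * math.factorial(3))  # the two should agree
print(overlap(2, 4))                                              # should be near zero
```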


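The $\mathrm{Hh}_n$ functions are all positive when $n > 0$ and there is no question of orthogonality.
In terms of the series solutions of (6)

$$ \mathrm{Hh}_n(x) = \sqrt{(\tfrac12\pi)}\;2^{-\frac12 n}\sum_{m=0}^{\infty} (-1)^m\,\frac{2^{\frac12 m}\,x^m}{m!\,\{\tfrac12(n-m)\}!} $$

$$ = \frac{\sqrt{(\tfrac12\pi)}}{2^{\frac12 n}\,(\tfrac12 n)!}\left(1 + \frac{n x^2}{2!} + \frac{n(n-2)x^4}{4!} + \dots\right) - \frac{\sqrt{(\tfrac12\pi)}}{2^{\frac12 n-\frac12}\,(\tfrac12 n-\tfrac12)!}\left(x + \frac{(n-1)x^3}{3!} + \frac{(n-1)(n-3)x^5}{5!} + \dots\right). \tag{16} $$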






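23.082. Asymptotic approximation to $\mathrm{Hh}_n(x)$ for $x$ large, $n > 0$. From 23.081 (1)

$$ \mathrm{Hh}_n(x) = \frac{e^{-\frac12 x^2}}{n!}\int_0^\infty t^n\,e^{-tx}\,e^{-\frac12 t^2}\,dt \tag{1} $$

$$ = \frac{e^{-\frac12 x^2}}{n!}\int_0^\infty e^{-tx}\sum_{m=0}^{\infty} (-1)^m\,\frac{t^{2m+n}}{2^m\,m!}\,dt \tag{2} $$

$$ \sim \frac{e^{-\frac12 x^2}}{n!}\sum_{m=0}^{\infty} (-1)^m\,\frac{(2m+n)!}{2^m\,m!\,x^{2m+n+1}}. \tag{3} $$

$$ \mathrm{Hh}_n(-x) + (-1)^n\,\mathrm{Hh}_n(x) = \frac{1}{n!}\int_0^\infty t^n\left(e^{-\frac12(t-x)^2} + (-1)^n e^{-\frac12(t+x)^2}\right)dt $$

$$ = \frac{1}{n!}\int_{-\infty}^{\infty} t^n\,e^{-\frac12(t-x)^2}\,dt = \frac{1}{n!}\int_{-\infty}^{\infty} (x+u)^n\,e^{-\frac12 u^2}\,du $$

$$ = \sum_{m} \frac{2^{\frac12(m+1)}\,(\tfrac12 m - \tfrac12)!}{m!\,(n-m)!}\,x^{n-m}, \tag{4} $$

summation being over even values of $m \le n$. The latter expression is exact but is given with
the asymptotic expansion because it is in descending powers. It is useful in estimating
$\mathrm{Hh}_n(-x)$ for large $x$, since $\mathrm{Hh}_n(x)$ is then small.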

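23.083. Asymptotic approximation to $\mathrm{Hh}_n(x)$ for $n$ large, $> 0$, $x$ moderate.

$\mathrm{Hh}_n(x)$ is very small when $x$ is more than about 3, but $n$ may be large. Then if

$$ \phi(t) = n\log t - tx - \tfrac12 t^2, \tag{5} $$

$$ \phi'(t) = \frac{n}{t} - x - t, \tag{6} $$

$$ \phi''(t) = -\frac{n}{t^2} - 1. \tag{7} $$

Then for $n$ large and $x$ moderate, $\phi'(t) = 0$ gives $t = n^{1/2} - \tfrac12 x$ as a first approximation,
$\phi''(t) = -2$. Then

$$ \phi(n^{1/2} - \tfrac12 x) \approx \tfrac12 n\log n - n^{1/2}x - \tfrac12 n + \tfrac14 x^2 \tag{8} $$

and

$$ \mathrm{Hh}_n(x) \sim \frac{e^{-\frac14 x^2}}{n!}\,\sqrt\pi\;n^{\frac12 n}\,e^{-\frac12 n - x\sqrt n}. \tag{9} $$

Taking $x = 0$ and applying Stirling's formula we recover the first term of the ascending
series to order $1/n$; the approximation is therefore checked.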

23.084. Asymptotic approximation to $D_n(x)$ for large $n$. This is most easily
found from the differential equation 23.081 (9) with $-n-1$ in place of $n$, which is

$$ \frac{d^2z}{dx^2} + (n + \tfrac12 - \tfrac14 x^2)\,z = 0. \tag{1} $$

Put

$$ n + \tfrac12 = m, \qquad x = 2m^{1/2}\xi. \tag{2} $$

Then

$$ \frac{d^2z}{d\xi^2} + 4m^2(1 - \xi^2)\,z = 0. \tag{3} $$

The solutions are of exponential type for $\xi > 1$ or $\xi < -1$ and oscillating for $\xi^2 < 1$; and

$$ 2m\int_\xi^1 \sqrt{(1-\xi^2)}\,d\xi = m(\tfrac12\pi - \theta - \tfrac12\sin 2\theta) \qquad (\xi = \sin\theta), \tag{4} $$

$$ 2m\int_1^\xi \sqrt{(\xi^2-1)}\,d\xi = m(\tfrac12\sinh 2u - u) \qquad (\xi = \cosh u). \tag{5} $$

Then a solution decreasing exponentially as $\xi \to +\infty$ is

$$ (\xi^2-1)^{-1/4}\exp\{-m\xi(\xi^2-1)^{1/2}\}\,\{\xi + (\xi^2-1)^{1/2}\}^m \sim 2(1-\xi^2)^{-1/4}\sin\{m(\tfrac12\pi - \theta - \tfrac12\sin 2\theta) + \tfrac14\pi\}. \tag{6} $$

The constant factor is determined from the fact that when $x$ is large $D_n(x)$ is asymptotically
$x^n\exp(-\tfrac14 x^2)$; it is thus found to be

$$ 2^{-1/2}\,m^{\frac12 n}\,e^{-\frac12 m} \approx 2^{-1/2}\,n^{\frac12 n}\,e^{-\frac12 n}. \tag{7} $$

A check is obtained by taking $x = 0$ and therefore $\theta = 0$. Evidently $D_n(0) = 0$ for odd $n$;
for even $n$ the right side of (6) is $\pm 2$. But for even $n$

$$ \pm D_n(0) = \frac{d^n}{dx^n}\,\frac{(\tfrac12 x^2)^{\frac12 n}}{(\tfrac12 n)!} = \frac{(\tfrac12)^{\frac12 n}\,n!}{(\tfrac12 n)!} \approx \sqrt2\;n^{\frac12 n}\,e^{-\frac12 n}. \tag{8} $$

Hence to obtain an approximation to $D_n(x)$ with an error $O(1/n)$ we must multiply both
sides of (6) by

$$ 2^{-1/2}\,(n/e)^{\frac12 n}. \tag{9} $$


23.09. Accuracy of steepest descents approximations. This approximation to
$\mathrm{Hh}_n(x)$, when $n$ is large, suggests a way of estimating the error in stopping at the smallest
term in an asymptotic expansion found by steepest descents. In the expression

$$ I = \int_{-\infty}^{\infty} e^{-\frac12 a^2 z^2} f(z)\,dz \tag{1} $$

put $f(z) + f(-z) = g(z)$; then $g(z)$ is an even function, and

$$ I = \int_0^\infty e^{-\frac12 a^2 z^2} g(z)\,dz. \tag{2} $$

Integrate $2n$ times by parts; then, since odd $g^{(m)}(z)$ vanish at $z = 0$,

$$ I = \frac1a\,g(0)\,\mathrm{Hh}_0(0) + \frac{g''(0)}{a^3}\,\mathrm{Hh}_2(0) + \dots + \frac{g^{(2n)}(0)}{a^{2n+1}}\,\mathrm{Hh}_{2n}(0) + R_{2n}, \tag{3} $$

where

$$ R_{2n} = \frac{1}{a^{2n+1}}\int_0^\infty \mathrm{Hh}_{2n}(az)\,g^{(2n+1)}(z)\,dz = \frac{1}{a^{2n+2}}\int_0^\infty \mathrm{Hh}_{2n+1}(az)\,g^{(2n+2)}(z)\,dz. \tag{4} $$

Now let the singularity of smallest modulus of $f(z)$ be at $re^{i\alpha}$; then $g(z)$ has singularities
at $\pm re^{i\alpha}$, near which we suppose that $g(z)$ behaves like a negative power or a logarithm.
The contributions to $g(z)$ from these will have the forms

$$ \frac{A}{(re^{i\alpha} - z)^m} + \frac{A}{(re^{i\alpha} + z)^m} \quad\text{or}\quad A\log(r^2e^{2i\alpha} - z^2), \tag{5} $$

where $m$ is independent of $a$ and $n$. We assume that $a$ is large enough for the smallest
term to be at a large value of $n$; now

$$ g^{(2n)}(z) = A\,\frac{(m+2n-1)!}{(m-1)!}\left\{\frac{1}{(re^{i\alpha} - z)^{m+2n}} + \frac{1}{(re^{i\alpha} + z)^{m+2n}}\right\} \approx \frac{(2n)!\,M}{r^{2n}e^{2ni\alpha}}\cosh\frac{2nz}{re^{i\alpha}} \quad\text{for } z \text{ small}, \tag{6} $$

in the sense that $M$ can be chosen so that the ratio of $g^{(2n)}(z)$ to an expression of this form
tends to 1 when $n$ is large. Also if $u_{2n}$ is the general term of (3)


u 2 n +2 ^ 2n+2) (0) Hh 2w+2 (0) ^ 2_ (2 ro+l)(2 n + 2) 1 2 n 

u 2n * a 2 f 2n) (0 ) Hh 2n (0) * a 2 r 2 e 2ict 2(»+l) * a 2 r 2 e 2ict ' 


The smallest term is therefore specified by 

2 n = [r 2 a 2 ], 

a t> . u in {2n+l){2n + 2) f®, ,2?iz 

and " r 2 aHh 2 ~ (0)e 2 ^ g J 0 COsh ^> 


= u Zn a{2n) lh e ~ 2ia J exp (- <j(2n) az ~%a 2 z 2 ) cosh (V(2w) aze-*“) <Zz. 


(7) 

( 8 ) 

(9) 


If $n$ is large and $\alpha$ not small, the integrand becomes small before the term in $a^2z^2$ becomes
important, and $R_{2n}$ reduces to*

$$ R_{2n} \approx u_{2n}\,e^{-2i\alpha}\int_0^\infty e^{-x}\cosh(xe^{-i\alpha})\,dx = \frac{u_{2n}}{e^{2i\alpha} - 1}, \tag{10} $$

$$ u_{2n} + R_{2n} \approx \frac{u_{2n}}{1 - e^{-2i\alpha}} = (\tfrac12 - \tfrac12 i\cot\alpha)\,u_{2n}. \tag{11} $$

The simple rule found for the incomplete factorial function that we should take half the
smallest term is therefore true for steepest descents approximations only if $\alpha = \pm\tfrac12\pi$,
and this will be shown by successive terms being precisely opposite in phase. $R_{2n}$ will
exceed $u_{2n}$ in modulus if $|\cot\alpha| > \sqrt3$.

This discussion is rough, but serves two purposes. Most discussions of the error in
stopping at a given term of an asymptotic approximation treat special functions and take
the general term. But the method has a wide generality and should be capable of a more
general treatment. We see that the fact that the early terms give a rapid decrease in the
error is due to the rapid decrease of the early $\mathrm{Hh}_n(az)$ with increasing $n$ or $az$ when $a$ is
large; the early derivatives of $g(z)$ can be treated as approximately constant in the range
where the $\mathrm{Hh}_n$ factor is appreciable. But when we approximate many times by integration
by parts, though the $\mathrm{Hh}_n$ factor decreases, the $g^{(2n)}(z)$ factor varies more and more rapidly
with $z$ until when $2n = r^2a^2$ its variation becomes as important as that of $\mathrm{Hh}_{2n}(az)$,
and nothing is to be gained by further attempts to approximate on these lines. This
explanation is probably familiar to many pure mathematicians, but possibly they have
not succeeded in stating it with the precision that they like; but a physicist likes to see
even a rough discussion that brings out the point.

* See Airey, Phil. Mag. (7), 24, 1937, 526.

The approximate formula (11) makes it possible to allow for the remainder by simple
inspection of the terms near the smallest, $\alpha$ being determined directly by comparison of the
phases. It breaks down if $\alpha = 0$, that is, if all the terms have the same sign. This might be
expected, since it means that the path of integration passes through a singularity.
Difficulties have been found, for instance, with the estimation of the remainder in $I_n(x)$
with $x$ real; but the path of steepest descent passes twice through the subsidiary saddle-point
at $-1$, and the integrand is infinite there. The situation is saved to some extent by the
fact that the improper integral exists, but it is necessary to break the range of integration
up if integration by parts is to be used. Our approximation (9) is extremely crude in this
case, but if we put $\alpha = 0$ we get

R 2n + £ w 2n( 2w ) 1/3 J* o ex P ( ~ i% 2 ) dx = \ ^j{mT) U 2n (12) 

and the factor is a warning that the size of the smallest term is no safe guide to the 
accuracy in such a case. 

Similar considerations are applicable to the integral 

1 = r e -^f(z)dz = 2 Fe-«m*KdZ. (13) 

Jo Jo 

The same methods will apply except that, $\zeta f(\zeta^2)$ being an odd function, it will be the terms
in even derivatives that vanish. (11) will still hold with $2n - 1$ for $2n$.


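EXAMPLES

1. Prove the recurrence relations

$$ x\,{}_1F_1(\alpha+1,\gamma+1,x) = \gamma\,[{}_1F_1(\alpha+1,\gamma,x) - {}_1F_1(\alpha,\gamma,x)], $$

$$ \alpha\,{}_1F_1(\alpha+1,\gamma+1,x) = (\alpha-\gamma)\,{}_1F_1(\alpha,\gamma+1,x) + \gamma\,{}_1F_1(\alpha,\gamma,x), $$

$$ (\alpha+x)\,{}_1F_1(\alpha+1,\gamma+1,x) = (\alpha-\gamma)\,{}_1F_1(\alpha,\gamma+1,x) + \gamma\,{}_1F_1(\alpha+1,\gamma,x), $$

$$ \alpha\gamma\,{}_1F_1(\alpha+1,\gamma,x) = \gamma(\alpha+x)\,{}_1F_1(\alpha,\gamma,x) - x(\gamma-\alpha)\,{}_1F_1(\alpha,\gamma+1,x), $$

$$ \alpha\,{}_1F_1(\alpha+1,\gamma,x) = (x+2\alpha-\gamma)\,{}_1F_1(\alpha,\gamma,x) + (\gamma-\alpha)\,{}_1F_1(\alpha-1,\gamma,x), $$

$$ (\gamma-\alpha)\,x\,{}_1F_1(\alpha,\gamma+1,x) = \gamma(x+\gamma-1)\,{}_1F_1(\alpha,\gamma,x) + \gamma(1-\gamma)\,{}_1F_1(\alpha,\gamma-1,x). $$

(B.A. Report, 1926.)

2. Prove that

$$ \exp(-\tfrac14 x^2 + xt - \tfrac12 t^2) = \sum_{n=0}^{\infty}\frac{t^n}{n!}\,D_n(x). $$

Hence prove that

$$ \int_{-\infty}^{\infty} e^{i\kappa x}\,D_n(x)\,dx = 2\sqrt\pi\;i^n\,D_n(2\kappa). $$

3. Prove that for real $a$, $b$, $c$ the hypergeometric series converges at $z = 1$ if $c > a + b$ and diverges
if $c < a + b$; and that it converges at $z = -1$ if $c + 1 > a + b$.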
Chapter 24

LEGENDRE FUNCTIONS AND ASSOCIATED FUNCTIONS 


‘You boil it in sawdust; you salt it in glue; 

You condense it with locusts and tape; 

Still keeping one principal object in view. 

To preserve its symmetrical shape.’ 

LEWIS CARROLL, The Hunting of the Snark

24.01. Associated Legendre functions. We have seen that the solutions of the
potential, wave, and sound equations for spherical boundaries depend on the solution
of the equation

$$ (1-\mu^2)\frac{d^2\Theta}{d\mu^2} - 2(s+1)\mu\frac{d\Theta}{d\mu} + (n-s)(n+s+1)\Theta = 0, \tag{1} $$

where $\mu = \cos\theta$. Potential problems concerning the outside of spheroids depend on
the same differential equation, with $\mu > 1$ for prolate spheroids and $\mu$ purely imaginary
for oblate ones; in these problems we require the solution that tends to 0 when $\mu$
tends to $\infty$. Except when the contrary is stated, we shall take $n$, $s$ to be integers,
$n \ge s \ge 0$. Then one solution has index 0 at $\mu = \pm1$ and the other index $-s$. The
latter may contain a logarithm, and in any case will be infinite. Since one solution is an
odd and the other an even function of $\mu$, the solution with index 0 at $\mu = 1$ will also have
index 0 at $\mu = -1$, and will be the solution needed for problems of spherical boundaries.
It can have no other singularities and is therefore an integral function. Series solutions
can be found easily. They are given explicitly for $s = 0$ in 16.04. An expression in finite
terms for the solutions analytic at $\mu = \pm1$ is found by the method of 18.061 to be




(2) 


We can build up the solutions with singularities at $\pm1$ from those found by taking $s = n$
in (1). We have

$$ (1-\mu^2)\frac{d^2\Theta}{d\mu^2} - 2(n+1)\mu\frac{d\Theta}{d\mu} = 0. \tag{3} $$

One solution of this is a constant, and successive integrations will build up polynomial
solutions for smaller values of $s$; but these are already given by (2) without the need of
special attention to fix the constants of integration. The other solution is given by

$$ \frac{d\Theta_n^n}{d\mu} = \frac{1}{(\mu^2-1)^{n+1}}, \tag{4} $$

and if we choose the constants so that the solutions tend to 0 at $\mu = \infty$ they will be

$$ \Theta_n^s = -\frac{1}{(n-s)!}\int_\mu^\infty \frac{(\mu-u)^{n-s}}{(u^2-1)^{n+1}}\,du, \tag{5} $$

where the path does not cross the real axis between $-1$ and 1.
The indicial equation for $\mu$ large has roots $n - s$ and $-n-s-1$; the former corresponds
to (2), and the latter shows that we are justified in assuming for all $s$ the existence of a
solution vanishing at infinity. It is easy to verify that (5) satisfies

$$ \frac{d\Theta_n^s}{d\mu} = \Theta_n^{s+1} \tag{6} $$

and therefore it must be the second solution required. If we take $n = s = 0$ it becomes

$$ \Theta_0^0 = \tfrac12\log\frac{\mu-1}{\mu+1}, \tag{7} $$

so that there are logarithmic singularities at $\mu = \pm1$, and a special convention will be
needed to give a definite value to the function for $-1 \le \mu \le 1$. Except in this case we can
agree to take the value given by (5) for $\mu$ real and greater than 1 and define the function
by continuation, excluding the real values of $\mu$ between $-1$ and $+1$ by a cut. For
we shall see that a modified definition is possible, but in any case the second
solution will become infinite at $\mu = \pm1$.


24.02. Solutions of Laplace's equation in spherical polar coordinates will
then be

$$ \phi = (r^n,\ r^{-n-1})\,\sin^s\theta\;\Theta\;(\cos s\lambda,\ \sin s\lambda) \tag{8} $$

where $\Theta$ is given by either (2) or (5), with an appropriate constant factor; for a complete
sphere the single-valuedness of $\phi$ will require that for $s = 0$ the term in $\lambda$, which replaces
$\sin s\lambda$ in this case, will not occur, and the finiteness of $\phi$ at $\theta = 0$ or $\pi$ will exclude the
solution (5). We shall see that any function satisfying Laplace's equation inside or outside
a sphere can be expressed in terms of spherical harmonics, $\Theta$ being taken as in (2).

24.03. Potential in a cavity. Consider a closed surface not surrounding any matter.
Within it a potential function exists satisfying Laplace's equation. We can draw a sphere
about any point in this region and lying wholly in the region; and within such a sphere
the potential is given in terms of its values on the sphere by Green's integral 6.092 (8).
Now if we take $r < a$,

$$ \frac{a^2 - r^2}{R^3} = \frac{a^2 - r^2}{(a^2 - 2ar\cos\vartheta + r^2)^{3/2}}. $$

If we for a moment regard $r$ as a complex variable, this function and any of its derivatives
have singularities only at $r = ae^{\pm i\vartheta}$, and therefore have expansions in power series in $r$,
uniformly and absolutely convergent with regard to $r$ for $|r| \le c < a$. But the terms in
$r^n$ are the sum of terms of the form $(r^2)^m(r\cos\vartheta)^{n-2m}$, and $r\cos\vartheta$ is a linear function of
$x$, $y$, $z$. Hence for any $\phi$ integrable over $r = a$, the terms in $r^n$ can be expressed as a homogeneous
polynomial in $x$, $y$, $z$; and the series is uniformly convergent with regard to $\vartheta$ for
$|r| \le c$. Hence $\phi$ within the cavity can be expressed as the sum of a series of homogeneous
polynomials in $x$, $y$, $z$, valid for $r < a$, and any derivative of $\phi$ of any order with regard
to $x$, $y$, $z$ can also be so expressed. Hence sufficiently near the centre we can write $\phi$ as
the sum of a series of homogeneous polynomials in $x$, $y$, $z$

$$ \phi = \phi_0 + \phi_1 + \phi_2 + \dots + \phi_n + \dots, \tag{1} $$



$\phi_n$ being of the $n$th degree. Since $\nabla^2\phi = 0$ for all points in the sphere, $\nabla^2\phi_n = 0$, by equating
terms of equal degree. Thus $\phi$ can be expressed as a series of polynomials each satisfying
Laplace's equation. Obvious solutions are

$$ \phi_0 = 1; \qquad \phi_1 = x,\ y,\ z; \qquad \phi_2 = xy,\ yz,\ zx,\ x^2 - y^2,\ 2z^2 - x^2 - y^2. $$

Now when $z = 0$ let

$$ \phi_n = g(x,y), \qquad \frac{\partial\phi_n}{\partial z} = h(x,y). \tag{2} $$

$\phi_n$ can be expressed as a terminating Taylor series in powers of $z$. But

$$ \frac{\partial^{2r}\phi_n}{\partial z^{2r}} = (-1)^r\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)^r\phi_n, \qquad \frac{\partial^{2r+1}\phi_n}{\partial z^{2r+1}} = (-1)^r\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)^r\frac{\partial\phi_n}{\partial z} \tag{3} $$

and hence the derivatives with regard to $z$ at $z = 0$ can all be found by differentiating
$g(x,y)$ and $h(x,y)$. If we write

$$ \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} = \nabla_1^2, \tag{4} $$

$$ \phi_n = \sum (-1)^m\,\nabla_1^{2m}g(x,y)\,\frac{z^{2m}}{(2m)!} + \sum (-1)^m\,\nabla_1^{2m}h(x,y)\,\frac{z^{2m+1}}{(2m+1)!} \tag{5} $$

and $\phi_n$ is completely determined given $g(x,y)$ and $h(x,y)$. But $g(x,y)$ is a polynomial of
degree $n$ and therefore contains $n+1$ terms; $h(x,y)$ is of degree $n-1$ and contains $n$ terms.
If we substitute (5) in $\nabla^2\phi_n$, with arbitrary coefficients in $g(x,y)$ and $h(x,y)$, we find that
$\nabla^2\phi_n = 0$. Hence exactly $2n+1$ coefficients can be assigned independently, and $\phi_n$ can
be expressed in terms of $2n+1$ linearly independent polynomials.
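The construction of $\phi_n$ from its values and normal derivative on $z = 0$ can be carried out mechanically. In the sketch below a polynomial is stored as a dictionary mapping exponent triples $(i, j, k)$ of $x^iy^jz^k$ to coefficients; the series $\sum(-1)^m\nabla_1^{2m}g\,z^{2m}/(2m)! + \sum(-1)^m\nabla_1^{2m}h\,z^{2m+1}/(2m+1)!$ is built from monomial $g$ and $h$, and the result is checked to satisfy Laplace's equation. The representation and all function names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def add_term(poly, key, coeff):
    """Add coeff * monomial(key) into poly, dropping terms that cancel."""
    new = poly.get(key, 0) + coeff
    if new == 0:
        poly.pop(key, None)
    else:
        poly[key] = new

def laplacian(poly, axes=(0, 1, 2)):
    """Sum of second derivatives over the chosen axes (0: x, 1: y, 2: z)."""
    out = {}
    for key, c in poly.items():
        for axis in axes:
            e = key[axis]
            if e >= 2:
                new_key = list(key)
                new_key[axis] = e - 2
                add_term(out, tuple(new_key), c * e * (e - 1))
    return out

def harmonic_from_plane_data(g, h):
    """Build phi from g = phi and h = d(phi)/dz on z = 0 (keys must have k = 0)."""
    phi = {}
    for data, offset in ((g, 0), (h, 1)):
        term, m = dict(data), 0
        while term:
            power = 2 * m + offset
            for (i, j, _), c in term.items():
                add_term(phi, (i, j, power), Fraction((-1) ** m * c, factorial(power)))
            term = laplacian(term, axes=(0, 1))   # apply the plane Laplacian once more
            m += 1
    return phi

def harmonic_basis(n):
    """The 2n + 1 polynomials obtained from monomial g (n + 1 of them) and h (n of them)."""
    basis = [harmonic_from_plane_data({(i, n - i, 0): 1}, {}) for i in range(n + 1)]
    basis += [harmonic_from_plane_data({}, {(i, n - 1 - i, 0): 1}) for i in range(n)]
    return basis

print(harmonic_from_plane_data({(3, 0, 0): 1}, {}))   # x^3 - 3 x z^2
print([len(harmonic_basis(n)) for n in range(5)])     # 1, 3, 5, 7, 9
```

Exact rational arithmetic is used so that the vanishing of the Laplacian is tested exactly rather than to rounding error.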

If we take $s = 0, 1, \dots, n$ in

$$ r^n\sin^s\theta\;\frac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n\;(\cos s\lambda,\ \sin s\lambda) \tag{6} $$

we have $2n+1$ solutions, which are clearly independent since none of $\cos s\lambda$ and $\sin s\lambda$
can be linearly expressed in terms of the $\lambda$ factors for other values of $s$. The solutions are
expressible as polynomials in $x$, $y$, $z$. For $r^s\sin^s\theta\,(\cos s\lambda, \sin s\lambda)$ are the real and imaginary
parts of $(x+iy)^s$ and therefore are polynomials. Also $\dfrac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n$ is a polynomial in $\mu$
of degree $n-s$ of the form $\sum A_m\mu^{n-s-2m}$; and

$$ r^{n-s}\mu^{n-s-2m} = (r\mu)^{n-s-2m}\,r^{2m} = z^{n-s-2m}(x^2+y^2+z^2)^m. $$

Hence the solutions (6) are $2n+1$ linearly independent polynomials. Any polynomial of
degree $n$ in $x$, $y$, $z$ that satisfies $\nabla^2\phi = 0$ can therefore be expressed linearly in terms of them.
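That the polynomial factor $\dfrac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n$ satisfies the equation $(1-\mu^2)\Theta'' - 2(s+1)\mu\Theta' + (n-s)(n+s+1)\Theta = 0$ of 24.01 can be verified with exact integer arithmetic. In the sketch below polynomials in $\mu$ are lists of coefficients, lowest power first; the helper names are illustrative assumptions.

```python
def poly_mul(a, b):
    """Product of two coefficient lists (lowest power first)."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b):
    size = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(size)]

def poly_scale(a, c):
    return [c * ai for ai in a]

def poly_diff(a):
    return [k * a[k] for k in range(1, len(a))]

def theta(n, s):
    """Coefficients of d^(n+s)/dmu^(n+s) (mu^2 - 1)^n."""
    p = [1]
    for _ in range(n):
        p = poly_mul(p, [-1, 0, 1])    # multiply by (mu^2 - 1)
    for _ in range(n + s):
        p = poly_diff(p)
    return p

def legendre_residual(n, s):
    """(1 - mu^2) Theta'' - 2(s+1) mu Theta' + (n-s)(n+s+1) Theta, as coefficients."""
    th = theta(n, s)
    d1 = poly_diff(th)
    d2 = poly_diff(d1)
    out = poly_mul([1, 0, -1], d2)
    out = poly_add(out, poly_scale(poly_mul([0, 1], d1), -2 * (s + 1)))
    out = poly_add(out, poly_scale(th, (n - s) * (n + s + 1)))
    return out

print(theta(2, 0))                 # 12 mu^2 - 4, stored as [-4, 0, 12]
print(legendre_residual(4, 2))     # every coefficient should be zero
```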


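24.04. Solid and surface harmonics: explicit forms for
Laplace's equation can be written

$$ \frac{\partial}{\partial r}\left(r^2\frac{\partial\phi}{\partial r}\right) + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\phi}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2\phi}{\partial\lambda^2} = 0. $$

If $\phi = r^nS_n(\theta,\lambda)$ the first term is $n(n+1)\phi$, and the coefficient is unaltered by changing
$n$ into $-n-1$. Hence if $r^nS_n(\theta,\lambda)$ is a solution, $r^{-n-1}S_n(\theta,\lambda)$ is another, and conversely.

Such solutions are called solid harmonics of degree $n$ or $-n-1$ as the case may be, and
$S_n$ is called a surface harmonic.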






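This fact leads to another way of developing the standard solutions. For since $1/r$ is
a solution of Laplace's equation, of degree $-1$, any derivative $\dfrac{\partial^{l+m+n}}{\partial x^l\,\partial y^m\,\partial z^n}\dfrac1r$ is another of
degree $-l-m-n-1$. If we multiply it by $r^{2l+2m+2n+1}$ we shall therefore get another, of
degree $l+m+n$. Hence the functions

$$r^{2l+2m+2n+1}\,\frac{\partial^{l+m+n}}{\partial x^l\,\partial y^m\,\partial z^n}\left(\frac1r\right) \tag{1}$$

constitute a set of solid harmonics of degree $l+m+n$. Those of given degree are not
independent, since, for instance,

$$r^5\frac{\partial^2}{\partial x^2}\frac1r + r^5\frac{\partial^2}{\partial y^2}\frac1r + r^5\frac{\partial^2}{\partial z^2}\frac1r = r^5\,\nabla^2\frac1r = 0. \tag{2}$$

It is easy, however, to obtain an independent set in terms of them. We take

$$K^s_{-n-1} = \left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)^{s}\left(\frac{\partial}{\partial z}\right)^{n-s}\frac1r. \tag{3}$$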

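This is a solid harmonic, being a linear combination of derivatives of $1/r$. Suppose that
$z^2 > x^2+y^2$. Then

$$\frac1r = \sum_{m=0}^{\infty} (-1)^m\,\frac{1\cdot3\cdots(2m-1)}{2^m\,m!}\,\frac{(x^2+y^2)^m}{z^{2m+1}}. \tag{4}$$

Now

$$\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)F(x^2+y^2) = 2(x+iy)\,F'(x^2+y^2), \tag{5}$$

$$\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)g(x+iy) = g'(x+iy) - g'(x+iy) = 0. \tag{6}$$

Successive operations $\dfrac{\partial}{\partial x} + i\dfrac{\partial}{\partial y}$ therefore introduce powers of $x+iy$, but further
differentiation of these gives nothing; and

$$\left(\frac{\partial}{\partial z}\right)^{n}\frac1r = \frac{(-1)^n\,n!}{z^{n+1}} + \sum_{m\ge1}(-1)^{n+m}\,\frac{1\cdot3\cdots(2m-1)}{2^m\,m!}\,\frac{(2m+n)!}{(2m)!}\,\frac{(x^2+y^2)^m}{z^{2m+n+1}}, \tag{7}$$

$$\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)^{s}\left(\frac{\partial}{\partial z}\right)^{n-s}\frac1r
= \sum_m (-1)^{n-s+m}\,\frac{1\cdot3\cdots(2m-1)}{2^m\,m!}\,\frac{2^s\,m!}{(m-s)!}\,\frac{(2m+n-s)!}{(2m)!}\,\frac{(x+iy)^s\,(x^2+y^2)^{m-s}}{z^{2m+n-s+1}}\qquad (s>0). \tag{8}$$

Every term contains the factor $(x + iy)^s$ and therefore $e^{is\lambda}$. The lowest non-zero term has
$m = s$, and reduces to

$$(-1)^n\,\frac{(n+s)!}{2^s\,s!}\,\frac{(x+iy)^s}{z^{n+s+1}}. \tag{9}$$

Hence

$$K^s_{-n-1} = (-1)^n\,\frac{(n+s)!}{2^s\,s!}\,\frac{\sin^s\theta\,e^{is\lambda}}{r^{n+1}}\,\bigl(1 + O(\sin^2\theta)\bigr). \tag{10}$$

Since this is a solid harmonic of degree $-n-1$ proportional to $e^{is\lambda}$, and of order
$\sin^s\theta$ for $\theta$ small, $r^{n+1}K^s_{-n-1}$ must be a constant multiple of $\sin^s\theta\,e^{is\lambda}\,\dfrac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n$.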





Differentiating by Leibniz's theorem and picking out the only term that does not
vanish at $\mu = 1$ we have

$$\frac{d^{n+s}}{d\mu^{n+s}}\{(\mu-1)^n(\mu+1)^n\} = \frac{(n+s)!}{n!\,s!}\,n!\,\frac{n!}{(n-s)!}\,(\mu+1)^{n-s} + \ldots
= \frac{2^{n-s}\,n!\,(n+s)!}{(n-s)!\,s!} + O(\sin^2\theta), \tag{11}$$

and

$$r^{n+1}K^s_{-n-1} = (-1)^n\,\frac{(n-s)!}{2^n\,n!}\,\sin^s\theta\,e^{is\lambda}\,\frac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n. \tag{12}$$

This form suggests the most convenient way of assigning the constant factors in the
standard solutions. We shall take

$$p_n^s(\mu) = \frac{(n-s)!}{2^n\,(n!)^2}\,\sin^s\theta\,\frac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n. \tag{13}$$


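The usual definition is to take the constant factor as $1/2^n n!$ and call the resulting function
$P_n^s(\mu)$. The present form has one considerable advantage in symmetry. With it, let us
see what happens if we replace $s$ by $-s$. Since

$$\frac{\partial^2}{\partial z^2}\left(\frac1r\right) = -\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right)\frac1r, \tag{14}$$

we can make the interpretation

$$\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)^{-s}\frac{\partial^{n+s}}{\partial z^{n+s}}\left(\frac1r\right)
= (-1)^s\left(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\right)^{s}\frac{\partial^{n-s}}{\partial z^{n-s}}\left(\frac1r\right)
= (-1)^s\,K^{s*}_{-n-1}, \tag{15}$$

where the asterisk denotes that we replace $i\lambda$ by $-i\lambda$. Now see whether we get the same
relation by taking $-s$ for $s$ in (13); we have

$$2^n(n!)^2\,p_n^{-s}(\mu) = (n+s)!\,(1-\mu^2)^{-\frac12 s}\,\frac{d^{n-s}}{d\mu^{n-s}}\{(\mu-1)(\mu+1)\}^n$$

$$= (n+s)!\,(1-\mu^2)^{-\frac12 s}\sum_{m=0}^{n-s}\frac{(n-s)!\;n!\,(\mu-1)^{n-m}\;n!\,(\mu+1)^{s+m}}{m!\,(n-s-m)!\,(n-m)!\,(s+m)!},$$

where terms with $m > n-s$ vanish. Hence all terms contain $(\mu^2-1)^s$ as a factor, and the
sum is

$$(-1)^s\,(n+s)!\,(1-\mu^2)^{\frac12 s}\sum_{m=0}^{n-s}\frac{(n-s)!\;n!\,n!\,(\mu-1)^{n-m-s}(\mu+1)^{m}}{m!\,(n-s-m)!\,(n-m)!\,(s+m)!}. \tag{16}$$

But

$$2^n(n!)^2\,p_n^{s}(\mu) = (n-s)!\,(1-\mu^2)^{\frac12 s}\sum_{m=0}^{n+s}\frac{(n+s)!\;n!\,(\mu-1)^{n-m}\;n!\,(\mu+1)^{-s+m}}{m!\,(n+s-m)!\,(n-m)!\,(-s+m)!}, \tag{17}$$

and terms with $m < s$ vanish; putting $m = s + u$ we have

$$(n-s)!\,(1-\mu^2)^{\frac12 s}\sum_{u\ge0}\frac{(n+s)!\;n!\,n!\,(\mu-1)^{n-s-u}(\mu+1)^{u}}{(s+u)!\,(n-u)!\,(n-s-u)!\,u!}, \tag{18}$$

which is the same series as (16), terms with $u > n-s$ vanishing. Hence

$$p_n^{-s}(\mu) = (-1)^s\,p_n^{s}(\mu). \tag{19}$$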
In our original derivation we were not concerned with negative values of $s$, but we should
naturally like $p_n^s \exp i(\gamma t - s\lambda)$ and $p_n^{-s} \exp i(\gamma t + s\lambda)$ to represent two waves of equal
amplitude, passing around a sphere in opposite directions. With the present definition
this condition is satisfied, and apart from it there appears to be no reason for considering
negative $s$ at all. With the usual definition, which omits the factor $(n-s)!/n!$, the amplitudes
expressed by the usual solutions are very different. For $n = s = 4$ the ratio is
$8! = 40320$. The factor $(-1)^s$ presents a minor difficulty in securing symmetry, and
could be absorbed by including a factor $i^s$, but this would make the functions imaginary
for odd $s$ and does not seem worth while. C. G. Darwin* has already introduced the
factor $(n-s)!$ into the definition for the sake of symmetry, but it seems best at the same
time to divide by $n!$ so as to retain the usual standard solutions when $s = 0$.
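The size of the disparity under the usual definition follows from (19) together with $p_n^{\pm s} = \{(n \mp s)!/n!\}\,P_n^{\pm s}$, which give $|P_n^s/P_n^{-s}| = (n+s)!/(n-s)!$. A minimal sketch of the arithmetic (the function name is an illustrative assumption):

```python
from math import factorial


def usual_amplitude_ratio(n, s):
    """|P_n^s / P_n^{-s}| for the usual definition, i.e. (n+s)!/(n-s)!.

    With the definition adopted in the text the corresponding ratio
    |p_n^s / p_n^{-s}| is 1 for every n and s.
    """
    if not 0 <= s <= n:
        raise ValueError("need 0 <= s <= n")
    return factorial(n + s) // factorial(n - s)
```

For $n = s = 4$ this gives $8!/0! = 40320$, the figure quoted above.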

Hobson associated the factor $(-1)^s$ with $P_n^s(\mu)$, not with $P_n^{-s}(\mu)$. In this respect he
is followed by Condon and Shortley,† who use normalized functions.

A common modern procedure is to normalize the functions, that is, to introduce a
constant factor so that the integral of the square of the function over the range used is 1.
This device simplifies the writing of general proofs in, for instance, the theory of integral
equations. But in a simple application it would mean that we must not use $\cos x$ and $\sin x$;
we must use $(2/\pi)^{1/2}\cos x$ and $(2/\pi)^{1/2}\sin x$ if the range used is $\pi$, and $\pi^{-1/2}\cos x$ and $\pi^{-1/2}\sin x$
if the range is $2\pi$. Presumably separate tables would be wanted in the two cases. For
more complicated functions the normalizing factor introduces square roots everywhere,
and especially it needlessly complicates the recurrence relations. When the range is
infinite the normalizing integral may diverge (e.g. $\int_0^\infty x J_n^2(x)\,dx$) and other devices are
needed. We shall therefore take as the standard functions of the first kind (i.e., behaving
like $\sin^s\theta$ near $\theta = 0$ and $\pi$)

$$p_n^s(\mu) = \frac{(n-s)!}{n!}\,P_n^s(\mu) = \frac{(n-s)!}{2^n\,(n!)^2}\,\sin^s\theta\,\frac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n, \tag{20}$$

and the corresponding solutions of Laplace's equation are

$$(r^n,\ r^{-n-1})\;p_n^s(\mu)\,(\cos s\lambda,\ \sin s\lambda) \tag{21}$$

related to the solid harmonics of degree $-n-1$ by

$$\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)^{s}\left(\frac{\partial}{\partial z}\right)^{n-s}\frac1r = (-1)^n\,n!\;r^{-n-1}\,p_n^s(\mu)\,e^{is\lambda}. \tag{22}$$

The functions $p_n^0(\mu)$ are usually called the Legendre functions or polynomials, or zonal
harmonics; it is usual to suppress the explicit mention of $s$ when it is zero. $p_n^s$ is called an
associated Legendre function, and $p_n^s(\cos s\lambda, \sin s\lambda)$ tesseral harmonics, apparently after a
kind of dice known to the Romans. If $s = n$ the tesseral harmonic is called a sectorial
harmonic.

* Proc. Roy. Soc. A, 118, 1928, 668.
† The Theory of Atomic Spectra, 1935, 62.

It is important to have a general idea of the appearance of the functions. By the
general principle that the zeros of a derivative of a continuous function separate those of
the function, since $(\mu^2-1)^n$ has $n$ zeros at $+1$ and $n$ at $-1$, its first derivative has $n-1$ at
each of these values and one between, its second two between $-1$ and $+1$, and $p_n$ has $n$,
all real and between $+1$ and $-1$. Powers of $\sin\theta$ never vanish except at $\mu = \pm1$ in this
range, and for $s > 0$, if we do not count the zeros at $\theta = 0$ and $\pi$, $p_n^s$ will have $n-s$ zeros, all
real. Thus the zonal harmonics keep the same sign each over $n+1$ belts, as between
parallels of latitude, counting each polar cap as a belt; each increase of $s$ by 1 reduces the
number of parallels where the harmonic vanishes by 1, but increases by 2 the number of
meridians where it does so. We see easily from

$$p_n(\mu) = \frac{1}{2^n\,n!}\,\frac{d^n}{d\mu^n}\{(\mu-1)^n(\mu+1)^n\}$$

on differentiating by Leibniz's theorem that all terms but one vanish at $\mu = \pm1$, giving

$$p_n(1) = 1,\qquad p_n(-1) = (-1)^n. \tag{23}$$

Actually $|p_n(\mu)|$ never exceeds 1. For $\mu$ near 1 the lowest term in $p_n^s$ is got by
taking $u = n-s$ in (18); we get

$$p_n^s \approx \frac{(n+s)!}{2^s\,n!\,s!}\,\sin^s\theta.$$


24·05. Expansion of $(r^2 - 2rh\cos\theta + h^2)^{-1/2}$: Green's function for a sphere. The
harmonics with $s = 0$ are particularly important, since many disturbances are symmetrical
about an axis. We have

$$\left(\frac{\partial}{\partial z}\right)^{n}\frac1r = (-1)^n\,n!\;r^{-n-1}\,p_n(\mu). \tag{1}$$

Consider the function

$$\frac1R = \frac{1}{\{x^2+y^2+(z-h)^2\}^{1/2}} = \frac{1}{(r^2 - 2rh\cos\theta + h^2)^{1/2}}, \tag{2}$$

which has a convergent expansion in negative powers of $r$ if $h < r$. But since $R$ involves
$z$ and $h$ only through $z - h$,

$$\left(\frac{\partial}{\partial h}\right)^{n}\frac1R = \left(-\frac{\partial}{\partial z}\right)^{n}\frac1R, \tag{3}$$

and by Taylor's theorem

$$\frac1R = \sum_{n=0}^{\infty}\frac{h^n}{n!}\left[\left(\frac{\partial}{\partial h}\right)^{n}\frac1R\right]_{h=0} = \sum_{n=0}^{\infty}\frac{h^n}{r^{n+1}}\,p_n(\mu). \tag{4}$$

This expansion is often taken as providing the definition of $P_n(\mu)$ $(= p_n(\mu))$. The explicit
form of $P_n(\mu)$ can be found from it quite easily by Lagrange's expansion.* But there are
few practical cases where this expansion yields explicit expressions for the general term
without great difficulty, and we prefer to regard it as an existence theorem, in spite of this
solitary instance of its use. In practice the associated functions are extremely important,
and it seems best to have a definition that can deal with them from the start.

If $0 < \alpha < 1$,

$$(1 - 2\alpha\cos\theta + \alpha^2)^{-1/2} = (1-\alpha e^{i\theta})^{-1/2}\,(1-\alpha e^{-i\theta})^{-1/2}$$

$$= \left(1 + \tfrac12\alpha e^{i\theta} + \tfrac{1\cdot3}{2\cdot4}\alpha^2e^{2i\theta} + \ldots\right)\left(1 + \tfrac12\alpha e^{-i\theta} + \tfrac{1\cdot3}{2\cdot4}\alpha^2e^{-2i\theta} + \ldots\right) \tag{5}$$

and the coefficient of $\alpha^n$ is a sum of cosines, all with positive coefficients. It is therefore
greatest numerically if they are all $\pm1$; hence

$$|p_n(\cos\theta)| \le p_n(1) = 1. \tag{6}$$

* Cf. Jeans, Electricity and Magnetism, 1908, 215. 
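A related series of much importance is

$$S = \sum_{n=0}^{\infty}(2n+1)\,\frac{h^n}{r^{n+1}}\,p_n(\mu). \tag{7}$$

Differentiating (4) and multiplying by $h$ we have

$$h\,\frac{\partial}{\partial h}\,\frac1R = \sum n\,\frac{h^n}{r^{n+1}}\,p_n(\mu)\qquad (h<r),$$

and therefore

$$S = \frac{1}{(r^2 - 2rh\cos\theta + h^2)^{1/2}} + \frac{2h(r\cos\theta - h)}{(r^2 - 2rh\cos\theta + h^2)^{3/2}} = \frac{r^2-h^2}{(r^2 - 2rh\cos\theta + h^2)^{3/2}}, \tag{8}$$

which is the function that arises in the determination of a potential function given its
values on a sphere. If $h > r$ the same expansions hold with the exception that $h$ and $r$ must
be interchanged.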
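Both series can be checked numerically against their closed forms. The sketch below is only an illustration: it generates $p_n(\mu)$ from the three-term recurrence 24·10 (3) rather than from the definition, and all function names are assumptions introduced here.

```python
import math


def legendre_values(mu, n_max):
    """[p_0(mu), ..., p_n_max(mu)] from (n+1)p_{n+1} = (2n+1) mu p_n - n p_{n-1}."""
    values = [1.0, mu]
    for n in range(1, n_max):
        values.append(((2 * n + 1) * mu * values[n] - n * values[n - 1]) / (n + 1))
    return values[: n_max + 1]


def inverse_distance_series(r, h, theta, n_max=200):
    """Partial sum of (4): sum of h^n p_n(cos theta) / r^(n+1), for h < r."""
    p = legendre_values(math.cos(theta), n_max)
    return sum(h**n / r ** (n + 1) * p[n] for n in range(n_max + 1))


def poisson_series(r, h, theta, n_max=200):
    """Partial sum of (7): sum of (2n+1) h^n p_n(cos theta) / r^(n+1), for h < r."""
    p = legendre_values(math.cos(theta), n_max)
    return sum((2 * n + 1) * h**n / r ** (n + 1) * p[n] for n in range(n_max + 1))


def inverse_distance_closed(r, h, theta):
    """1/R from (2)."""
    return (r * r - 2 * r * h * math.cos(theta) + h * h) ** -0.5


def poisson_closed(r, h, theta):
    """(r^2 - h^2) / R^3 from (8)."""
    return (r * r - h * h) * (r * r - 2 * r * h * math.cos(theta) + h * h) ** -1.5
```

With $h/r = \tfrac12$ the terms fall off geometrically, so a few dozen terms already reproduce the closed forms to rounding error; the default of 200 terms is merely generous.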

24·06. Potential outside matter. To the theorem about expansibility of a potential
function in a sphere within a cavity corresponds one about expansibility outside a sphere
that contains the whole of the matter whose potential is being considered. Let $P(x_i)$ be
a point outside such a sphere, the origin being the centre, and $Q(\xi_i)$ the position of a mass
$dm$; then the potential is $f\int dm/R$, where
&=p* = s, (i) 

5 ( 2 ) 

which is uniformly convergent on and outside the sphere, and therefore can be multiplied
by $dm$ and integrated term by term. Thus the potential is developed in a series of negative
powers of $r$, which can be differentiated term by term as often as we like provided that $r$
is greater than every value of $\rho$. Hence $\nabla^2\phi$ is another convergent series, and is identically
zero, and the terms in $\phi$ of every degree separately must satisfy $\nabla^2\phi = 0$. But if $r^{-n-1}f(\theta, \lambda)$
satisfies Laplace's equation so does $r^n f(\theta, \lambda)$, and therefore $f(\theta, \lambda)$ is linearly expressible
in terms of our solutions. Hence $\phi$ can be expressed by a series of the form

$ = ^ + S £ “Pi Pn(p) fans cos sA + sin sA). (3) 

The condition that $r$ is greater than every value of $\rho$ is sufficient for the existence of
this expansion, but not necessary. Consider any distribution of matter within a surface $S$,
the maximum of $\rho$ on which is $c$. Take a further surface $S'$ outside it. The field outside
$S'$ is the same, by the theorem of the equivalent stratum, as that of a suitable distribution
of sources and doublets over $S'$, which does not need to be a sphere. Let $a$ be the maximum,
$b$ the minimum, of $\rho$ on $S'$. If the condition that $r$ must be greater than every value of $\rho$
was necessary to the existence of an expansion of the form (3), the expansion of the
potential due to the distribution on $S'$ would exist only for $r > a$. But it is the same as the
potential due to the distribution within $S$, which has an expansion for all $r > c$. If then
$b > c$ the expansion will exist right down to $S'$ even though $S'$ is not a sphere. It is therefore
possible for the potential outside a surface with matter on it to have an expansion in
negative powers of $r$ without the surface being a sphere, so that the expansion is being
applied at some places where $r$ is less than the largest value of $\rho$. This feature in potential
theory is the analogue of analytic continuation in the theory of the complex variable; the
expansion will exist if the potential outside $S'$ is the same as that due to some distribution of
matter within a surface inside $S'$ such that the largest value of $\rho$ on this surface is less than
the smallest on $S'$. It is particularly important in problems relating to boundaries that
are not exact spheres, in particular in the theory of the figure of the Earth. The external
potential can then often be continued into the body, but the result will not be the actual
potential in the body, since the continuation of the external potential, if it exists, will
satisfy Laplace's equation and the actual potential within the body will not. There is also
an analogue of singularities. In the case of a charged sphere with a projecting point on
it there will in general be a local concentration of charge on the projection, and the
potential due to this cannot be represented by that due to any internal distribution.

24·07. Orthogonality relations: expansion theorem. Given that an expansion
exists, it can be determined by considering the values of the function over a sphere.
All our standard solutions are mutually orthogonal in the sense that the product of any two
of them, multiplied by the surface element $dS$, and integrated over a sphere, gives 0. First,
let $S_m$ and $S_n$ be any two surface harmonics of different degrees. Then $\phi_m = r^m S_m$ and
$\phi_n = r^n S_n$ satisfy Laplace's equation. Therefore if we apply Green's theorem to a sphere
of radius $a$

$$\iint\left(\phi_m\frac{\partial\phi_n}{\partial r} - \phi_n\frac{\partial\phi_m}{\partial r}\right)dS = 0. \tag{1}$$

But this is the same as

$$(n-m)\,a^{m+n-1}\iint S_m S_n\,dS, \tag{2}$$

and the first factor cannot vanish if $m \ne n$. Hence if $m \ne n$

$$\iint S_m S_n\,dS = 0. \tag{3}$$

Also, for harmonics of the same degree, any pair of $p_n^s\cos s\lambda$, $p_n^s\sin s\lambda$, $p_n^t\cos t\lambda$, $p_n^t\sin t\lambda$
are orthogonal since the integral with regard to $\lambda$ vanishes, except in the case where $s = t$
and we take either the cosine factor in both cases or the sine factor in both and are therefore
integrating the square of a harmonic.

Since we can take $S_m = p_m^s\cos s\lambda$, $S_n = p_n^s\cos s\lambda$, it follows that if $m \ne n$

$$\iint p_m^s\,p_n^s\cos^2 s\lambda\,\sin\theta\,d\theta\,d\lambda = 0, \tag{4}$$

and therefore

$$\int_{-1}^{1}p_m^s\,p_n^s\,d\mu = 0\qquad (m\ne n). \tag{5}$$

Linear independence between the standard harmonics follows immediately, though we
have already verified it by another method. For if we denote any of our harmonics by
$Y_p$ and there was a general relation

$$\sum a_p Y_p = 0,$$

we could multiply by any $Y_q$ with a non-zero coefficient and integrate over the sphere, and
the result would be 0. But every term separately gives 0 by the orthogonality relations
except $Y_q$, which gives $a_q\iint Y_q^2\,dS$, and this cannot vanish since by hypothesis $a_q \ne 0$.
Hence the assumption of any linear relation between the harmonics leads to a contradiction.
It also follows that every zonal harmonic $p_n$ is orthogonal to every polynomial in $\mu$ of lower
degree. For if $f(\mu)$ is a polynomial in $\mu$ of degree $m < n$, the term of highest degree being
a multiple of $\mu^m$, we can subtract such a multiple of $p_m$ as will remove this term. We can then subtract
such a multiple of $p_{m-1}$ as will remove the term in $\mu^{m-1}$ in the remainder, and so proceed.
The process ends in $m+1$ steps and a sum of multiples of zonal harmonics of degrees $\le m$
is found that is identically equal to $f(\mu)$. But each of these harmonics has degree different
from $n$ and therefore is orthogonal to $p_n$. Hence

$$\int_{-1}^{1}f(\mu)\,p_n(\mu)\,d\mu = 0.$$

Now suppose that a function $f(\theta, \lambda)$ has an expansion in spherical harmonics, so that

$$f(\theta,\lambda) = \sum_{n=0}^{\infty}\sum_{s=0}^{n}\bigl(a_{ns}\,p_n^s\cos s\lambda + b_{ns}\,p_n^s\sin s\lambda\bigr), \tag{6}$$

where we suppose that $f(\theta, \lambda)$ is known, and require to determine the coefficients $a_{ns}$, $b_{ns}$,
assuming that the expansion exists. Multiplying by the respective harmonics and
integrating with respect to $\sin\theta\,d\theta\,d\lambda$ we have, by the orthogonality relations,

$$a_{n0}\int_{-1}^{1}\!\int_0^{2\pi}p_n^2\,d\mu\,d\lambda = \int_{-1}^{1}\!\int_0^{2\pi}f(\theta,\lambda)\,p_n\,d\mu\,d\lambda, \tag{7}$$

$$a_{ns}\int_{-1}^{1}\!\int_0^{2\pi}(p_n^s)^2\cos^2 s\lambda\,d\mu\,d\lambda = \int_{-1}^{1}\!\int_0^{2\pi}f(\theta,\lambda)\,p_n^s\cos s\lambda\,d\mu\,d\lambda, \tag{8}$$

$$b_{ns}\int_{-1}^{1}\!\int_0^{2\pi}(p_n^s)^2\sin^2 s\lambda\,d\mu\,d\lambda = \int_{-1}^{1}\!\int_0^{2\pi}f(\theta,\lambda)\,p_n^s\sin s\lambda\,d\mu\,d\lambda. \tag{9}$$


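The coefficients are therefore determined in terms of definite integrals. For those on the
left, integration with regard to $\lambda$ gives $2\pi$ or $\pi$. Also

$$\int_{-1}^{1}p_n^2\,d\mu = \frac{1}{2^{2n}(n!)^2}\int_{-1}^{1}\left\{\frac{d^n}{d\mu^n}(\mu^2-1)^n\right\}^2d\mu. \tag{10}$$

Integrate by parts $n$ times; the integrated parts all vanish at both limits and we are left
with

$$\frac{(-1)^n}{2^{2n}(n!)^2}\int_{-1}^{1}(\mu^2-1)^n\,\frac{d^{2n}}{d\mu^{2n}}(\mu^2-1)^n\,d\mu = \frac{(2n)!}{2^{2n}(n!)^2}\,(-1)^n\int_{-1}^{1}(\mu^2-1)^n\,d\mu$$

$$= \frac{(2n)!}{2^{2n}(n!)^2}\;2\int_0^{\frac12\pi}\sin^{2n+1}\theta\,d\theta
= \frac{2\,(2n)!}{2^{2n}(n!)^2}\;\frac{2n\cdot(2n-2)\cdots2}{(2n+1)(2n-1)\cdots1} = \frac{2}{2n+1}. \tag{11}$$

The corresponding integration for $(p_n^s)^2$ can be simplified a little by using $p_n^{-s}$. We have

$$\int_{-1}^{1}(p_n^s)^2\,d\mu = (-1)^s\int_{-1}^{1}p_n^s\,p_n^{-s}\,d\mu
= \frac{(-1)^s\,(n-s)!\,(n+s)!}{2^{2n}(n!)^4}\int_{-1}^{1}\frac{d^{n+s}}{d\mu^{n+s}}(\mu^2-1)^n\;\frac{d^{n-s}}{d\mu^{n-s}}(\mu^2-1)^n\,d\mu. \tag{12}$$

Integrate by parts $s$ times; we get

$$\frac{(n-s)!\,(n+s)!}{2^{2n}(n!)^4}\int_{-1}^{1}\left\{\frac{d^n}{d\mu^n}(\mu^2-1)^n\right\}^2d\mu = \frac{(n-s)!\,(n+s)!}{(n!)^2}\int_{-1}^{1}p_n^2\,d\mu.$$

Hence

$$\int_{-1}^{1}(p_n^s)^2\,d\mu = \frac{2}{2n+1}\,\frac{(n-s)!\,(n+s)!}{(n!)^2}, \tag{13}$$

$$\int_0^{2\pi}\!\int_0^{\pi}(p_n^s)^2\,(\cos^2 s\lambda,\ \sin^2 s\lambda)\,\sin\theta\,d\theta\,d\lambda = \frac{2\pi}{2n+1}\,\frac{(n-s)!\,(n+s)!}{(n!)^2}, \tag{14}$$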


whence the coefficients can be found from (7), (8), (9). 

It will be noticed that this expansion has the property, like the Fourier expansion and
all other expansions in orthogonal functions, that if $\Sigma$ is the sum of any finite number of
terms of a series of harmonics, with arbitrary coefficients, and we adjust the coefficients
so as to make $\iint(f(\theta, \lambda) - \Sigma)^2\,d\mu\,d\lambda$ a minimum, the resulting coefficients are the coefficients
$a_{ns}$, $b_{ns}$. There is an immediate analogue of Parseval's theorem,

$$\iint\{f(\theta,\lambda)\}^2\,d\mu\,d\lambda = \sum_{n=0}^{\infty}\sum_{s=0}^{n}a_{ns}^2\iint(p_n^s)^2\cos^2 s\lambda\,d\mu\,d\lambda + \sum_{n=0}^{\infty}\sum_{s=0}^{n}b_{ns}^2\iint(p_n^s)^2\sin^2 s\lambda\,d\mu\,d\lambda,$$

expressing that the mean square of $f$ over the sphere is the sum of the mean squares of its
harmonic components.

24·071. The above argument assumes that the expansion exists. We have proved
this only for the potential over a sphere such that matter is either all exterior or all
interior to it. Extensions to more general forms of $f(\theta, \lambda)$ can be made in various ways,
as for Fourier series. A proof, on the supposition that $f(\theta, \lambda)$ has continuous second
derivatives, is given by Courant and Hilbert.* If $f(\theta, \lambda)$ does not satisfy this condition,
but nevertheless can be uniformly approximated to over the sphere, except possibly in
a set of points capable of being enclosed within an arbitrarily small total area, by
functions that do satisfy it, it will follow immediately that a series of the form (6)
exists that will agree with $f(\theta, \lambda)$ to any assignable accuracy, except in the region
excluded. Such a set of functions can be assigned in many ways; one is by an extension
of the argument of 14·08, but perhaps the simplest is to note that if $f(\theta, \lambda)$ is the
potential on a sphere of radius $a$, and $\iint f(\theta, \lambda)\sin\theta\,d\theta\,d\lambda$ over the sphere exists, we can
take a set of interior concentric spheres of radii $a - \delta_n$, where $\delta_n \to 0$, and the potentials
over these spheres have derivatives of all orders. Further, by a similar argument to
that of 14·05, we can show that as $\delta_n \to 0$ the potentials on these spheres tend uniformly
to $f(\theta, \lambda)$ in any closed region of $\theta, \lambda$ such that $f(\theta, \lambda)$ is continuous. Consequently a
sufficient condition that $f(\theta, \lambda)$ can be approximated to by a series of surface harmonics
almost everywhere is that it shall be integrable over the sphere.

As for Fourier series, this type of approximation is possible in some cases where
there is no expansion of the form 24·07 (6). Conditions that 24·07 (6), with coefficients
given by (7), (8), (9), may converge to $f(\theta, \lambda)$ are more difficult to state than for Fourier
series if $f(\theta, \lambda)$ has discontinuities.

* Methoden der Mathematischen Physik, 1, 1924, 421-22.
24·08. Explicit forms of the functions $p_n^s$, up to $n = 4$, are as follows. The polynomials
obtained by multiplying by $r^n(\cos s\lambda, \sin s\lambda)$, and the mean square values of the functions,
associated with their factors in $\lambda$, over a sphere are also given.

 n   s   $p_n^s$                                                    Mean square   Polynomials
 0   0   $1$                                                        $1$           $1$
 1   0   $\cos\theta$                                               $1/3$         $z$
 1   1   $\sin\theta$                                               $1/3$         $x,\ y$
 2   0   $\tfrac32\cos^2\theta - \tfrac12$                          $1/5$         $\tfrac12(2z^2-x^2-y^2)$
 2   1   $\tfrac32\cos\theta\sin\theta$                             $3/20$        $\tfrac32 zx,\ \tfrac32 zy$
 2   2   $\tfrac32\sin^2\theta$                                     $3/5$         $\tfrac32(x^2-y^2),\ 3xy$
 3   0   $\tfrac52\cos^3\theta - \tfrac32\cos\theta$                $1/7$         $\tfrac12 z(2z^2-3x^2-3y^2)$
 3   1   $\tfrac12\sin\theta\,(5\cos^2\theta-1)$                    $2/21$        $\tfrac12 x(4z^2-x^2-y^2),\ \tfrac12 y(4z^2-x^2-y^2)$
 3   2   $\tfrac52\sin^2\theta\cos\theta$                           $5/21$        $\tfrac52 z(x^2-y^2),\ 5xyz$
 3   3   $\tfrac52\sin^3\theta$                                     $10/7$        $\tfrac52(x^3-3xy^2),\ \tfrac52(3x^2y-y^3)$
 4   0   $\tfrac18(35\cos^4\theta-30\cos^2\theta+3)$                $1/9$         $\tfrac18\{8z^4-24z^2(x^2+y^2)+3(x^2+y^2)^2\}$
 4   1   $\tfrac58\sin\theta\,(7\cos^3\theta-3\cos\theta)$          $5/72$        $\tfrac58 x\{4z^3-3z(x^2+y^2)\},\ \tfrac58 y\{4z^3-3z(x^2+y^2)\}$
 4   2   $\tfrac58\sin^2\theta\,(7\cos^2\theta-1)$                  $5/36$        $\tfrac58(x^2-y^2)(6z^2-x^2-y^2),\ \tfrac54 xy(6z^2-x^2-y^2)$
 4   3   $\tfrac{35}{8}\sin^3\theta\cos\theta$                      $35/72$       $\tfrac{35}{8}z(x^3-3xy^2),\ \tfrac{35}{8}z(3x^2y-y^3)$
 4   4   $\tfrac{35}{8}\sin^4\theta$                                $35/9$        $\tfrac{35}{8}(x^4-6x^2y^2+y^4),\ \tfrac{35}{2}xy(x^2-y^2)$

In the mean square values, when $s \ne 0$, the factor $\tfrac12$ obtained by averaging $\cos^2 s\lambda$ or $\sin^2 s\lambda$
has been taken into account.
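The mean square column follows from 24·07 (13) on dividing by the range 2 of $\mu$ and, for $s \ne 0$, by a further 2 for the average of $\cos^2 s\lambda$ or $\sin^2 s\lambda$. The sketch below, with illustrative function names that are not from the text, evaluates this in exact fractions and also checks one entry by direct quadrature of the tabulated form of $p_3^1$.

```python
from fractions import Fraction
from math import factorial


def mean_square(n, s):
    """Mean square of p_n^s (cos or sin s*lambda) over the sphere, from 24.07 (13)."""
    value = Fraction(factorial(n - s) * factorial(n + s),
                     factorial(n) ** 2 * (2 * n + 1))
    return value if s == 0 else value / 2


def simpson(func, a, b, intervals=2000):
    """Composite Simpson rule with an even number of intervals."""
    h = (b - a) / intervals
    total = func(a) + func(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def p31_squared(mu):
    """Square of the tabulated p_3^1 = (1/2) sin(theta) (5 cos^2(theta) - 1)."""
    return (1 - mu * mu) * (0.5 * (5 * mu * mu - 1)) ** 2


def mean_square_p31_numeric():
    """(1/4pi) * integral of (p_3^1 cos lambda)^2 over the sphere, by quadrature in mu."""
    # the lambda integral of cos^2 gives pi, so the mean is (1/4) * integral over mu
    return 0.25 * simpson(p31_squared, -1.0, 1.0)
```

Here `mean_square(3, 1)` reproduces the tabulated $2/21$, and the quadrature agrees with it to well within the accuracy of Simpson's rule.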


24·09. Analogue of Laurent's theorem. The theorems of 24·03 and 24·06,
relating to the expansions of the potential in a spherical cavity or outside a sphere,
can be extended immediately to the case where $\phi$ satisfies Laplace's equation in
the region between two spheres, one inside the other. We can apply the theorem of the
equivalent stratum to the region in question: the potential will be the sum of those due to
distributions over both the inner and the outer spheres, and can be represented by a series
of solid harmonics, but these will now include both positive and negative powers of $r$.
This is the spherical analogue of Laurent's theorem, and has been much used in terrestrial
magnetism. Part of the variable part of the magnetic field at the Earth's surface is due
to electric (ionization) currents in the upper atmosphere, part to currents in the Earth.
The former will give a potential at the surface expressible by a series of solid harmonics
of positive degrees, the latter a series of negative degrees. The variation of the potential
over the surface can be found by integrating the horizontal intensity of magnetic force,
and the vertical intensity can be measured directly. Now if the potential is

$$\phi = \sum\,(A_n r^n + B_n r^{-n-1})\,S_n,$$

the vertical intensity is

$$-\frac{\partial\phi}{\partial r} = \sum\,\{-nA_n r^{n-1} + (n+1)B_n r^{-n-2}\}\,S_n.$$

The terms in each harmonic for $r$ equal to the radius of the Earth being found from
observation, the coefficients give a pair of equations for $A_n$ and $B_n$, from which it can be
determined how much of the field is due to external and how much to internal currents.
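The pair of equations is easily solved in closed form. As a minimal sketch, suppose the observed coefficients of $S_n$ at $r = a$ are $V_n$ in the potential and $Z_n$ in the vertical intensity (these two symbols, and the function name, are assumptions introduced for the illustration). Then $A_n a^n = \{(n+1)V_n - aZ_n\}/(2n+1)$ and $B_n a^{-n-1} = (nV_n + aZ_n)/(2n+1)$.

```python
def separate_sources(n, a, potential_coeff, vertical_coeff):
    """Solve  V = A a^n + B a^(-n-1),  Z = -n A a^(n-1) + (n+1) B a^(-n-2)  for (A, B).

    A multiplies r^n (sources outside the sphere r = a);
    B multiplies r^(-n-1) (sources inside it).
    """
    external_part = ((n + 1) * potential_coeff - a * vertical_coeff) / (2 * n + 1)
    internal_part = (n * potential_coeff + a * vertical_coeff) / (2 * n + 1)
    return external_part / a**n, internal_part * a ** (n + 1)
```

A round trip with made-up numbers, forming $V_n$ and $Z_n$ from chosen $A_n$, $B_n$ and recovering them, is a convenient check on the signs.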

24·10. Recurrence formulae. From

$$(1 - 2\mu\alpha + \alpha^2)^{-1/2} = \sum p_n(\mu)\,\alpha^n, \tag{1}$$

by differentiation with respect to $\alpha$, we get

$$\frac{\mu-\alpha}{(1 - 2\mu\alpha + \alpha^2)^{3/2}} = \sum n\,p_n\,\alpha^{n-1}. \tag{2}$$

Multiply (1) by $(\mu - \alpha)$ and (2) by $1 - 2\mu\alpha + \alpha^2$, and compare coefficients of $\alpha^n$. We find

$$(n+1)\,p_{n+1} - (2n+1)\,\mu\,p_n + n\,p_{n-1} = 0, \tag{3}$$

a recurrence relation connecting three consecutive zonal harmonics.
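Starting from $p_0 = 1$, $p_1 = \mu$, relation (3) generates the polynomials in succession. The following sketch does this in exact rational arithmetic; the function name and the coefficient-list representation are assumptions made for the illustration.

```python
from fractions import Fraction


def legendre_coefficients(n_max):
    """coeffs[n][k] is the coefficient of mu**k in p_n(mu), built from relation (3)."""
    coeffs = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        shifted = [Fraction(0)] + coeffs[n]           # mu * p_n, length n + 2
        previous = coeffs[n - 1] + [Fraction(0)] * 2  # p_{n-1} padded to length n + 2
        coeffs.append([((2 * n + 1) * shifted[k] - n * previous[k]) / (n + 1)
                       for k in range(n + 2)])
    return coeffs[: n_max + 1]
```

The list for $n = 4$ is $[\tfrac38, 0, -\tfrac{15}{4}, 0, \tfrac{35}{8}]$, that is $\tfrac18(35\mu^4 - 30\mu^2 + 3)$, in agreement with 24·08, and the coefficients of each $p_n$ sum to 1, as required by $p_n(1) = 1$.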

Differentiating (1) with respect to $\mu$ we have

$$\frac{\alpha}{(1 - 2\mu\alpha + \alpha^2)^{3/2}} = \sum \alpha^n\,\frac{dp_n}{d\mu}, \tag{4}$$

and comparison with (2) leads to

$$\mu\,\frac{dp_n}{d\mu} - \frac{dp_{n-1}}{d\mu} = n\,p_n. \tag{5}$$

If we now differentiate (3) and eliminate $\mu\,dp_n/d\mu$ we get

$$\frac{dp_{n+1}}{d\mu} - \frac{dp_{n-1}}{d\mu} = (2n+1)\,p_n, \tag{6}$$

and hence

$$(2n+1)\int p_n\,d\mu = p_{n+1} - p_{n-1}. \tag{7}$$

These can be generalized by differentiation to give recurrence relations between the $s$th
derivatives, and hence between the $p_n^s$:*

$$(2n+1)\,\frac{d^{s-1}p_n}{d\mu^{s-1}} = \frac{d^{s}p_{n+1}}{d\mu^{s}} - \frac{d^{s}p_{n-1}}{d\mu^{s}}, \tag{8}$$

$$(n-s+1)\,\frac{d^{s}p_{n+1}}{d\mu^{s}} = (2n+1)\,\mu\,\frac{d^{s}p_n}{d\mu^{s}} - (n+s)\,\frac{d^{s}p_{n-1}}{d\mu^{s}}. \tag{9}$$

Direct relations between the $p_n^s$ are probably less convenient, since it is desirable to keep
the $\sin^s\theta$ factor outside the differentiation. The following formula, however, is easily
proved from 24·04 (3) and has the peculiarity that $s$ does not appear in the coefficients:

$$p^s_{n+1} = \mu\,p_n^s - \frac{1-\mu^2}{n+1}\,\frac{dp_n^s}{d\mu}. \tag{10}$$


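Other recurrence relations, the proofs of which present no difficulty, are as follows:

$$(1-\mu^2)\,\frac{dp_n}{d\mu} = n(n+1)\int_{\mu}^{1}p_n\,d\mu = \frac{n(n+1)}{2n+1}\,(p_{n-1} - p_{n+1})
= (n+1)(\mu p_n - p_{n+1}) = n\,(p_{n-1} - \mu p_n).$$

If $f(\mu) = 0$ for $\mu < \alpha$, and $= 1$ for $\mu > \alpha$, where $|\alpha| < 1$,

$$\int_{-1}^{1}f(\mu)\,p_n(\mu)\,d\mu = \frac{1}{2n+1}\,\{p_{n-1}(\alpha) - p_{n+1}(\alpha)\}\qquad (n \ge 1),$$

$$\int_{-1}^{1}f(\mu)\,d\mu = 1-\alpha,$$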
* Adams, Collected Scientific Papers, 2, 243-96. 
and the expansion of $f(\mu)$ in Legendre polynomials is

$$\tfrac12(1-\alpha) + \tfrac12\sum_{n=1}^{\infty}\{p_{n-1}(\alpha) - p_{n+1}(\alpha)\}\,p_n(\mu).$$
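The coefficient formula can be compared with a direct quadrature of $\int_\alpha^1 p_n\,d\mu$. The sketch below is illustrative only; the function names are assumptions, and the $p_n$ are generated from the recurrence (3).

```python
def legendre_values(mu, n_max):
    """[p_0(mu), ..., p_n_max(mu)] from the three-term recurrence (3)."""
    values = [1.0, mu]
    for n in range(1, n_max):
        values.append(((2 * n + 1) * mu * values[n] - n * values[n - 1]) / (n + 1))
    return values[: n_max + 1]


def step_integral_formula(n, alpha):
    """{p_{n-1}(alpha) - p_{n+1}(alpha)} / (2n+1), valid for n >= 1."""
    p = legendre_values(alpha, n + 1)
    return (p[n - 1] - p[n + 1]) / (2 * n + 1)


def simpson(func, a, b, intervals=2000):
    """Composite Simpson rule with an even number of intervals."""
    h = (b - a) / intervals
    total = func(a) + func(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def step_integral_quadrature(n, alpha):
    """Integral of p_n from alpha to 1, evaluated numerically."""
    return simpson(lambda mu: legendre_values(mu, n)[n], alpha, 1.0)


def step_partial_sum(mu, alpha, n_max):
    """Partial sum of the Legendre expansion of the step function f(mu)."""
    p = legendre_values(mu, n_max)
    total = 0.5 * (1 - alpha)
    for n in range(1, n_max + 1):
        total += 0.5 * (2 * n + 1) * step_integral_formula(n, alpha) * p[n]
    return total
```

As with a Fourier series for a discontinuous function, the partial sums given by `step_partial_sum` can be expected to converge slowly and to oscillate near $\mu = \alpha$.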

We can now give a few examples of the use of spherical harmonics.* 

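24·11. Potential due to a uniform circular disk. We have seen (6·032) that the
potential on the axis is

$$\phi = 2\pi f\sigma\,\{\sqrt{(b^2+x^2)} - |x|\}, \tag{1}$$

where $b$ is the radius of the disk and $x$ is the distance from the plane of the disk. Replace
$x$ by $r$ and expand in descending powers of $r$; this is valid provided $r > b$.

$$\phi = 2\pi f\sigma\,r\left\{\frac12\,\frac{b^2}{r^2} + \frac{\frac12(-\frac12)}{2!}\,\frac{b^4}{r^4} + \ldots + \frac{\frac12(-\frac12)\cdots(\frac32-m)}{m!}\,\frac{b^{2m}}{r^{2m}} + \ldots\right\}$$

$$= \pi f\sigma\,b^2\left\{\frac1r - \frac{1}{2\cdot2!}\,\frac{b^2}{r^3} + \ldots + \frac{(-\frac12)\cdots(\frac32-m)}{m!}\,\frac{b^{2m-2}}{r^{2m-1}} + \ldots\right\}. \tag{2}$$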


We now interpret $r$ as the distance from the centre of the disk and introduce the polar
coordinate $\theta$ measured from the axis of the disk. The above form of $\phi$ is correct only for
$\theta = 0$. But we know that the potential for $r > b$ is expansible in a series of the form
$\sum r^{-m-1}S_m$, and by symmetry it must depend on $r$, $\theta$ only. Hence for every value of
$m$ the expression of $S_m$ in terms of the standard functions can contain only $p_m$. Further,
$p_m = 1$ for $\theta = 0$, and therefore the only form of $\phi$ that (1) satisfies Laplace's equation,
(2) has symmetry about the axis, (3) reduces to (2) on the axis, is

$$\phi = \pi f\sigma\,b^2\left\{\frac1r - \frac{1}{2\cdot2!}\,\frac{b^2}{r^3}\,p_2 + \ldots + \frac{(-\frac12)\cdots(\frac32-m)}{m!}\,\frac{b^{2m-2}}{r^{2m-1}}\,p_{2m-2} + \ldots\right\}. \tag{3}$$

If $r < b$ a similar expansion in ascending powers of $r$ is possible.
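The series (3) can be summed numerically and compared with the closed form (1) on the axis. This is a sketch under stated assumptions: the product $f\sigma$ is passed as a single parameter `f_sigma`, the function names are illustrative, and the coefficient of the $m$th term is computed as twice the binomial coefficient $\binom{1/2}{m}$.

```python
import math


def legendre_values(mu, n_max):
    """[p_0(mu), ..., p_n_max(mu)] from the recurrence 24.10 (3)."""
    values = [1.0, mu]
    for n in range(1, n_max):
        values.append(((2 * n + 1) * mu * values[n] - n * values[n - 1]) / (n + 1))
    return values[: n_max + 1]


def disk_potential_series(r, theta, b, f_sigma=1.0, terms=60):
    """Series (3) for the potential of a uniform disk of radius b, valid for r > b."""
    p = legendre_values(math.cos(theta), 2 * terms)
    binom = 1.0  # binomial coefficient C(1/2, 0)
    total = 0.0
    for m in range(1, terms + 1):
        binom *= (0.5 - (m - 1)) / m  # now C(1/2, m)
        # 2*C(1/2, m) equals (-1/2)...(3/2 - m)/m!, the coefficient in (3)
        total += 2 * binom * b ** (2 * m - 2) / r ** (2 * m - 1) * p[2 * m - 2]
    return math.pi * f_sigma * b * b * total


def disk_potential_axis(x, b, f_sigma=1.0):
    """Closed form (1) on the axis, at distance |x| from the plane of the disk."""
    return 2 * math.pi * f_sigma * (math.sqrt(b * b + x * x) - abs(x))
```

On the axis ($\theta = 0$) the two functions should agree for any $r > b$; off the axis only the series is available.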


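24·111. Potential due to given surface density over a sphere. By the expansion
theorem we can express $\sigma$ as a series of spherical harmonics

$$\sigma = \sum_{n=0}^{\infty}\sum_{s=0}^{n}a_{ns}\,p_n^s\cos s\lambda + \sum_{n=0}^{\infty}\sum_{s=1}^{n}b_{ns}\,p_n^s\sin s\lambda, \tag{1}$$

which we can write shortly as $\sum c_{ns}S_{ns}$. The potentials inside and outside the sphere are
expressible by series

$$\phi_1 = \sum A_{ns}\left(\frac{r}{a}\right)^{n}S_{ns},\qquad \phi_0 = \sum B_{ns}\left(\frac{r}{a}\right)^{-n-1}S_{ns}. \tag{2}$$

The potential is continuous on crossing the sphere. Hence when $r = a$ both $\phi_0$ and $\phi_1$
must reduce to the potential on the sphere, and by the expansion theorem $A_{ns} = B_{ns}$.

Again, the discontinuity in $\partial\phi/\partial n$ is $-4\pi f\sigma$; that is,

$$\left(\frac{\partial\phi_0}{\partial r}\right)_{r\to a} - \left(\frac{\partial\phi_1}{\partial r}\right)_{r\to a} = -4\pi f\sigma.$$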

* Integrals of products of three spherical harmonics are given by Adams, loc. cit. pp. 343-400;
J. A. Gaunt, Phil. Trans. A, 228, 1929, 192-6.
Again using the expansion theorem, we can equate coefficients of all the harmonics, and

$$-(2n+1)\,\frac{A_{ns}}{a} = -4\pi f\,c_{ns},$$

$$A_{ns} = B_{ns} = \frac{4\pi f a\,c_{ns}}{2n+1}.$$

Thus $\phi_0$ and $\phi_1$ are completely determined. The treatment of all harmonics with the same
$n$ is exactly similar; $S_{ns}$ is therefore usually replaced simply by $S_n$. But if we are given the
values of $\sigma$ over a sphere it will be necessary in any case to perform the expansion with
regard to $s$ as well as $n$ before we have the answer.
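As a small illustration of the result $A_{ns} = B_{ns} = 4\pi f a\,c_{ns}/(2n+1)$, the sketch below returns the radial factor multiplying $S_{ns}$ inside and outside the sphere. The function names and the default $f = 1$ are assumptions made for the example.

```python
import math


def harmonic_amplitude(n, c_ns, a, f=1.0):
    """A_ns = B_ns = 4 pi f a c_ns / (2n + 1)."""
    return 4 * math.pi * f * a * c_ns / (2 * n + 1)


def radial_factor_inside(r, n, c_ns, a, f=1.0):
    """Coefficient of S_ns in the interior potential (r <= a)."""
    return harmonic_amplitude(n, c_ns, a, f) * (r / a) ** n


def radial_factor_outside(r, n, c_ns, a, f=1.0):
    """Coefficient of S_ns in the exterior potential (r >= a)."""
    return harmonic_amplitude(n, c_ns, a, f) * (a / r) ** (n + 1)
```

For a uniform shell ($n = 0$, $c_{00} = \sigma$) the exterior value reduces to $fM/r$ with $M = 4\pi a^2\sigma$, and the interior value is the constant $4\pi f a\sigma$, as it should be.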

24·112. Potential due to given density within a sphere. We take

$$\rho = \sum \rho_{ns}S_{ns},$$

where $\rho_{ns}$ will now be a function of $r'$, the distance from the centre. Also

$$\phi_1 = \sum A_{ns}S_{ns},\qquad \phi_0 = \sum B_{ns}\left(\frac{a}{r}\right)^{n+1}S_{ns},$$

where $A_{ns}$ will be a function of $r$, but $B_{ns}$ will be a constant since $\phi_0$ must satisfy Laplace's
equation. Two methods are available. We can use the conditions that $\phi_1$ must satisfy
Poisson's equation and $\phi$ and $\partial\phi/\partial r$ must be continuous at $r = a$. Alternatively, the shell
between $r'$ and $r' + dr'$ can be regarded as a surface distribution of density $\rho\,dr'$. The
potential due to this, for $r < r'$, is

$$\frac{4\pi f}{2n+1}\sum \rho_{ns}S_{ns}\left(\frac{r}{r'}\right)^{n}r'\,dr',$$

and for $r > r'$, is

$$\frac{4\pi f}{2n+1}\sum \rho_{ns}S_{ns}\left(\frac{r'}{r}\right)^{n+1}r'\,dr'.$$

Adding up for all shells,

$$\phi_1 = 4\pi f\sum\int_0^{r}\frac{\rho_{ns}S_{ns}}{2n+1}\,\frac{r'^{\,n+2}}{r^{n+1}}\,dr' + 4\pi f\sum\int_r^{a}\frac{\rho_{ns}S_{ns}}{2n+1}\,\frac{r^{n}}{r'^{\,n-1}}\,dr'.$$

24·113. Potential of a nearly spherical conductor. We take the equation of the
conductor to be

$$r = a\Bigl(1 + \sum \epsilon_{ns}S_{ns}\Bigr), \tag{1}$$

where the $\epsilon_{ns}$ are constants, small enough for their squares to be neglected. We
assume also

^ = ^- + 2^-] S n9 . (2) 

$\phi_1$ will of course be a constant since the surface is supposed to be at uniform potential $v$.
Since $\phi_0$ would reduce to its first term if all the $\epsilon_{ns}$ were zero we can suppose that all the
$A_{ns}$ $(n > 0)$ are small of the same order of magnitude as the $\epsilon_{ns}$. There is a difficulty here
about the substitution of (1) into (2) directly, because (2) is not necessarily true if $r$ is less
than some value of $r$ on the conductor. But we can take a sphere $r = a + h$, where $h$ is
positive and just large enough for the sphere to enclose the whole of the conductor.
(2) holds outside this. Outside the conductor $\phi$ is continuous, and so are all its
derivatives. Hence on the conductor


4> = Wr-o+rj^ ( a + h-r)+o(h ), (3) 

( a \«+l /jn+1 

a + h) Sna + S ^ + 1 ^ ns (a + A)«+ 2 + * ~ r ) + 

= ^n* + #)> (4) 

and we can apply (2) down to the conductor with a negligible error. Hence
v = s ^ns{! - S (m +1) £ n8 

= ( n +l) e TW / ^n«} + ^^nfi / ^w + 0 (2 e2 )» (5) 

for an 0, A. Therefore ^ = (»+ l)e ns ^ 00> A w = v, 

^ # “®{r +S ( W+ l)(r) + ( 6 ) 

to the first order in the departures from a sphere. 

24·114. The figure of the Earth. In the problems just considered the form of
the surface and the values of $\phi$ or $\partial\phi/\partial r$ over it are enough to determine $\phi$ at external
points. We now come to a problem where the form of the surface itself is to be found, but
we know a great deal already about both $\phi$ and $\partial\phi/\partial r$ over it. The Earth is not quite a sphere,
the chief departures being the ellipticity and the elevations and depressions of the solid
surface above and below sea level. The distribution of density inside is not known directly,
but a great deal can be found out about the gravitational field outside the earth and about
its external form from observations of gravity at the solid surface. The departures of
the outer surface, gravity, and the gravitation potential from symmetry about the
centre are small enough for their squares to be neglected in a first approximation, the
range of each being of the order of 1/200 or 1/300 of the mean value.

The external gravitational potential can be written

$$U = \frac{fM}{r} + U', \tag{1}$$

where $f$ is the constant of gravitation, $M$ the mass of the Earth, and $U'$ satisfies Laplace's
equation and tends to zero like $r^{-3}$ for large $r$ provided that the centre of mass is taken as
origin. The acceleration of a free particle is $\operatorname{grad} U$. But the solid earth is rotating with
angular velocity $\omega$, and each particle of it has component accelerations relative to
non-rotating axes

$$(-\omega^2x,\ -\omega^2y,\ 0) = -\operatorname{grad}\tfrac12\omega^2(x^2+y^2) = -\operatorname{grad}\tfrac12\omega^2r^2\sin^2\theta. \tag{2}$$

Hence the difference between the accelerations of a free particle and the ground is $\operatorname{grad}\Psi$,
where

$$\Psi = U + \tfrac12\omega^2r^2\sin^2\theta. \tag{3}$$

The function $\Psi$ is called the geopotential. Observed gravity is its gradient, and the
surfaces of constant $\Psi$ are the level surfaces.
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644 Condition satisfied by potential 24*114 

On the ocean surface $\Psi$ is a constant, which we shall denote by $C$. Take a standard 
value of $r$, which we shall denote by $a$ and define precisely later; at present we need only 
say that it is such that $r$ over the Earth’s surface differs from it by small quantities of 
the first order. Then since $\Psi - fM/r$ is of the first order, it and its derivatives vary only by 
quantities of the second order when we change $r$ from $a$ to $a + r'$, where $r'$ is of the first 
order. If 

$$\Psi = \frac{fM}{r} + \tfrac12\omega^2 r^2\sin^2\theta + U' = C - gh, \tag{4}$$

where $h$ is small, and we choose $a$, $g_0$ so that 

$$C = \frac{fM}{a}, \qquad g_0 = \frac{fM}{a^2}, \tag{5}$$

then to the first order 

$$\frac{fM}{r} - C = -g_0 r', \tag{6}$$

and 

$$r' = h + \frac{1}{g_0}\left(\tfrac12\omega^2 r^2\sin^2\theta + U'\right). \tag{7}$$

We shall show that to the first order $h$ is the measured height above sea level. We shall 
denote the second term by $h'$; it represents the departure of the level surfaces from 
spheres due to rotation and to the higher harmonics in the potential. 

In triangulation differences of level $dh$ are measured upwards, normally to a level 
surface at the point of observation; therefore along a survey route the change of $\Psi$ is 
$-\int g\,dh$ (the negative sign because $g$ is the downward gradient of $\Psi$). $\Psi$ is a single-valued 
function of position, but $g$ varies, and therefore the measured height of a given place 
will depend somewhat on the route taken from sea level. The difference, however, is of 
the second order in departures from a sphere and will be neglected. Hence on the surface 

$$\Psi = C - gh, \tag{8}$$

where $g$ is the local gravity and $h$ the measured height. 

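Gravity at the outer surface is given by 

$$g^2 = \left(\frac{\partial\Psi}{\partial r}\right)^2 + \left(\frac{\partial\Psi}{r\,\partial\theta}\right)^2 + \left(\frac{\partial\Psi}{r\sin\theta\,\partial\lambda}\right)^2, \tag{9}$$

and the second and third terms are of the second order. Hence to the first order 

$$g = -\frac{\partial\Psi}{\partial r} = \frac{fM}{r^2} - \frac{\partial U'}{\partial r} - \omega^2 r\sin^2\theta. \tag{10}$$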


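The relations between $g_0$ and mean gravity, and between $a$ and the mean radius, will 
have to be found. Then from (4) 

$$\frac{fM}{r} = g_0\left(a - \frac{g}{g_0}h\right) - U' - \tfrac12\omega^2a^2\sin^2\theta, \tag{11}$$

and from (10), (11) 

$$g = g_0\left(1 - \frac{2gh}{ag_0}\right) - \frac{\partial U'}{\partial r} - \frac{2U'}{a} - 2\omega^2 a\sin^2\theta, \tag{12}$$

$$\frac{\partial U'}{\partial r} + \frac{2U'}{a} = g_0 - g\left(1 + \frac{2h}{a}\right) - 2\omega^2 a(1 - \cos^2\theta). \tag{13}$$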
This determines the left side in terms of $g$ at the outer surface; but as it is small and the 
outer surface is itself nearly a sphere we commit only a second order error in supposing 
it to hold at $r = a$. The first order corrections have been taken into account by the terms 
in $h$ and $\omega^2$. The observed value of gravity appears only in the expression $g(1 + 2h/a)$, 
which is practically the result of multiplying it by $(a + h)^2/a^2$, and would be the change 
required if gravity had been observed at the same height over the ocean surface and then 
the value at sea level was calculated from it according to the inverse square law. This rule, 
given by Stokes and later by Helmert, is known as the free air reduction. 
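
As a rough numerical illustration of the free air reduction, the following sketch compares the first-order factor $1 + 2h/a$ with the inverse-square factor $(a+h)^2/a^2$. The radius and station height are assumed round values chosen for illustration; they are not taken from the text.

```python
# Compare the first-order free-air factor with the inverse-square factor.
# a and h are illustrative round numbers (assumptions), in metres.
a = 6.371e6   # assumed mean radius of the Earth
h = 1000.0    # assumed height of the station above sea level


def first_order_factor(h, a):
    """Factor multiplying observed gravity in the expression g(1 + 2h/a)."""
    return 1.0 + 2.0 * h / a


def inverse_square_factor(h, a):
    """Factor (a + h)^2 / a^2 given by the inverse square law."""
    return (a + h) ** 2 / a ** 2


# The two factors differ only by the second-order quantity (h/a)^2.
diff = inverse_square_factor(h, a) - first_order_factor(h, a)
print(first_order_factor(h, a), inverse_square_factor(h, a), diff)
```

The difference is exactly $(h/a)^2$, which is of the second order and therefore negligible in the present approximation.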

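To solve, we first take the last term. If 

$$U_1' = 2\omega^2\frac{a^5}{r^3}\left(\tfrac13 - \cos^2\theta\right), \tag{14}$$

$$\frac{\partial U_1'}{\partial r} + \frac{2U_1'}{a} = -2\omega^2 a\left(\tfrac13 - \cos^2\theta\right) \quad (r = a), \tag{15}$$

and since 

$$\tfrac13 - \cos^2\theta = -\tfrac23\left(\tfrac32\cos^2\theta - \tfrac12\right) = -\tfrac23 p_2, \tag{16}$$

$U_1'$ is a solution of Laplace’s equation. Then putting 

$$U' = U_1' + U_2', \tag{17}$$

$$g\left(1 + \frac{2h}{a}\right) = \gamma_0 + \sum_{n=2}^{\infty} g_{ns}S_{ns}, \tag{18}$$

$$U_2' = \sum \frac{V_{ns}}{r^{n+1}}S_{ns}, \tag{19}$$

we find by equating coefficients 

$$g_0 = \gamma_0 + \tfrac43\omega^2 a, \tag{20}$$

$$V_{ns} = \frac{a^{n+2}}{n-1}\,g_{ns}, \tag{21}$$

$$U' = 2\omega^2\frac{a^5}{r^3}\left(\tfrac13 - \cos^2\theta\right) + \sum_{n=2}^{\infty}\frac{a^{n+2}}{(n-1)\,r^{n+1}}\,g_{ns}S_{ns}. \tag{22}$$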


Thus $U'$ is determined. The elevation of sea level above the standard sphere is $h'$. If in 
the interior of the land we take a point at a depth below the visible surface equal to 
the measured height $h$, this point also will be at a height $h'$ above the standard sphere. 
The locus of such points is called the geoid. 

Special interest is attached to the main ellipticity term. Returning to (4) we have on 
the geoid 

$$\frac{fM}{r} + \tfrac13\omega^2a^2 + \tfrac12\omega^2a^2\left(\tfrac13 - \cos^2\theta\right) + U' = g_0 a, \tag{23}$$

whence 

$$r - a = \frac13\frac{\omega^2a^2}{\gamma_0} + \frac52\frac{\omega^2a^2}{\gamma_0}\left(\tfrac13 - \cos^2\theta\right) + \sum_{n=2}^{\infty}\frac{a}{(n-1)\gamma_0}\,g_{ns}S_{ns}. \tag{24}$$

The constant terms in $r$ give the mean radius, in terms of which our $a$ is therefore 
determined. Denoting this by $a_0$ and ignoring now all terms but $p_2$, we can denote the equatorial 
and polar radii by $a_0(1 + \tfrac13 e)$ and $a_0(1 - \tfrac23 e)$, and for a general latitude 

$$r = a_0(1 + \tfrac13 e)\sin^2\theta + a_0(1 - \tfrac23 e)\cos^2\theta + O(e^2) = a_0\{1 + e(\tfrac13 - \cos^2\theta)\} + O(e^2). \tag{25}$$

$e$ is the ellipticity. Comparing terms in $\tfrac13 - \cos^2\theta$ we have 

_ 5 ^ 

e_ 2%T + V 


(26) 


and gravity, including the main ellipticity term, is to the first order 

$$\gamma_0\{1 - (\tfrac52 m - e)(\tfrac13 - \cos^2\theta)\}, \tag{27}$$

where $m = \omega^2 a_0/\gamma_0$, to the present order the ratio of the acceleration at the equator due 
to the Earth’s rotation to gravity. This is Clairaut's formula. Actually the analysis of 
gravity leads to a better determination of $e$ than survey does. The extension to higher 
harmonics is due to Stokes. 
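
A small numerical sketch of Clairaut's formula (27) may help fix the orders of magnitude. The values of $e$ and $m$ below are assumed round figures chosen for illustration, not values quoted in the text, and the function name is hypothetical.

```python
import math

# Illustrative round values (assumptions, not from the text):
e = 1.0 / 297.0   # ellipticity
m = 1.0 / 288.0   # centrifugal acceleration at the equator divided by gravity


def clairaut_gravity_ratio(theta, e, m):
    """Gravity divided by gamma_0 from formula (27); theta is the colatitude in radians."""
    return 1.0 - (2.5 * m - e) * (1.0 / 3.0 - math.cos(theta) ** 2)


g_pole = clairaut_gravity_ratio(0.0, e, m)
g_equator = clairaut_gravity_ratio(math.pi / 2.0, e, m)

# The polar excess of gravity over equatorial gravity, in units of gamma_0,
# is the coefficient (5m/2 - e).
print(g_pole - g_equator, 2.5 * m - e)
```

With these assumed values the fractional increase of gravity from equator to pole is a little over 0.005.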

The analysis leads also to a determination of the difference between the Earth’s principal 
moments of inertia. In MacCullagh’s formula (18·09) we know that the neglected 
terms are of order $r^{-4}$ and therefore cannot contain any terms in the $S_{2s}$. Hence the term 
in $\tfrac13 - \cos^2\theta$ in $U'$ has an exact relation to the moments of inertia. If we take $A = B$, 

$$I = A\sin^2\theta + C\cos^2\theta, \qquad A + B + C - 3I = (C - A)(1 - 3\cos^2\theta), \tag{28}$$

the term in question is 

$$\frac{fMa^2J}{r^3}\left(\tfrac13 - \cos^2\theta\right), \tag{29}$$

where 

$$J = \frac32\,\frac{C - A}{Ma^2}. \tag{30}$$

But by (22) it is 

( r 3* r °+ r 3?») a eos ^>- 

(31) 

Therefore 

$$J = e - \tfrac52 m + 2m = e - \tfrac12 m. \tag{32}$$


$J$ is about 1/600. From the theory of precession of the equinoxes it is known that $(C - A)/C$ 
is about 1/300, whence $C/Ma^2$ is about $\tfrac13$. This ratio is clear evidence of the increase of 
density of the Earth towards the centre. 
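
The arithmetic behind these estimates can be sketched as follows. The inputs are assumed round values (they are illustrative, not quoted from the text); the relations used are $J = e - \tfrac12 m$ and $J = \tfrac32 (C - A)/Ma^2$.

```python
# Assumed illustrative inputs (not from the text):
e = 1.0 / 297.0   # ellipticity
m = 1.0 / 288.0   # rotation parameter omega^2 a_0 / gamma_0
H = 1.0 / 305.0   # (C - A)/C, as inferred from precession

J = e - 0.5 * m                  # J = e - m/2
C_minus_A_over_Ma2 = J / 1.5     # from J = (3/2)(C - A)/(M a^2)
C_over_Ma2 = C_minus_A_over_Ma2 / H

print(1.0 / J, C_over_Ma2)
```

For comparison, a sphere of uniform density has $C/Ma^2 = \tfrac25$, so a value near $\tfrac13$ indicates concentration of mass towards the centre.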

The neglect of second order terms makes it likely that the quantities calculated by the 
first order theory will be inaccurate by about 1 part in 300. Modern observational 
determinations are capable of giving them with a higher accuracy, and for this purpose it has 
become necessary to extend the theory to the second order of small quantities. 

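24·12. Value of $\iint p_n(\cos\theta)\,S_n(\theta,\lambda)\,d\omega$ over a sphere. We know that there is an 
expansion 

$$S_n = a_{n0}\,p_n + \sum_{s=1}^{n} p_n^s\,(a_{ns}\cos s\lambda + b_{ns}\sin s\lambda), \tag{1}$$

and therefore 

$$\iint p_n(\cos\theta)\,S_n\,d\omega = a_{n0}\iint (p_n)^2\,d\omega = \frac{4\pi}{2n+1}\,a_{n0}. \tag{2}$$

Also since all $p_n^s$ vanish at $\theta = 0$ for $s \ge 1$, $S_n(0,\lambda) = a_{n0}$ for all $\lambda$; and $a_{n0}$ is the value of 
$S_n(\theta,\lambda)$ on the axis $\theta = 0$. 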

24·13. Change of axes of a zonal harmonic. Let $\vartheta$ be the angular distance from a 
fixed direction $OP\,(\theta,\lambda)$. The angular coordinates of a general point on the sphere are 
$(\theta',\lambda')$, and by a fundamental formula of spherical trigonometry 

$$\cos\vartheta = \cos\theta\cos\theta' + \sin\theta\sin\theta'\cos(\lambda' - \lambda). \tag{1}$$
Consider $p_n(\cos\vartheta)$. This is a surface harmonic of degree $n$, since we could have taken $OP$ 
as axis of reference. Hence it can be expressed in terms of the harmonics of degree $n$ in 
$\theta'$, $\lambda'$. Take 

$$p_n(\cos\vartheta) = a_0\,p_n(\cos\theta') + \sum_{s=1}^{n} p_n^s(\cos\theta')\,(a_s\cos s\lambda' + b_s\sin s\lambda'). \tag{2}$$

Integrating over a unit sphere we have 

$$\iint p_n(\cos\vartheta)\,p_n(\cos\theta')\,d\omega = \frac{4\pi}{2n+1}\,a_0, \tag{3}$$

$$\iint p_n(\cos\vartheta)\,p_n^s(\cos\theta')\,(\cos s\lambda',\,\sin s\lambda')\,d\omega = \frac{2\pi}{2n+1}\,\frac{(n-s)!\,(n+s)!}{n!\,n!}\,(a_s,\,b_s). \tag{4}$$

But if in the result of the last paragraph we replace $S_n(\theta,\lambda)$ by $p_n^s(\cos\theta')\cos s\lambda'$, and 
$\theta$ by $\vartheta$, we have 

$$\iint p_n(\cos\vartheta)\,p_n^s(\cos\theta')\cos s\lambda'\,d\omega = \frac{4\pi}{2n+1}\,p_n^s(\cos\theta)\cos s\lambda,$$

with analogous relations; whence 

$$(a_s,\,b_s) = 2\,\frac{n!\,n!}{(n-s)!\,(n+s)!}\;p_n^s(\cos\theta)\,(\cos s\lambda,\,\sin s\lambda), \tag{5}$$

except for $a_0$, for which the factor 2 does not occur; and 

$$p_n(\cos\vartheta) = p_n(\cos\theta)\,p_n(\cos\theta') + 2\sum_{s=1}^{n}\frac{n!\,n!}{(n-s)!\,(n+s)!}\;p_n^s(\cos\theta)\,p_n^s(\cos\theta')\cos s(\lambda' - \lambda). \tag{6}$$

This result, due to Legendre, is often called the addition theorem for spherical harmonics, 
and $p_n(\cos\vartheta)$ a biaxial harmonic. 
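
The addition theorem can be checked numerically. The sketch below assumes the normalization $p_n^s(\mu) = \frac{(n-s)!}{n!}(1-\mu^2)^{s/2}\,d^s p_n/d\mu^s$, which is how the functions are read here from the formulae of 24·16; the function names and the test angles are illustrative choices.

```python
import math


def p_ns(n, s, mu):
    """Assumed normalization: (n-s)!/n! * (1-mu^2)^(s/2) * d^s P_n / d mu^s."""
    root = math.sqrt(max(0.0, 1.0 - mu * mu))
    start = 1.0
    for k in range(1, s + 1):          # unnormalized P_s^s = (2s-1)!! (1-mu^2)^(s/2)
        start *= (2 * k - 1) * root
    if n == s:
        value = start
    else:
        prev, cur = start, mu * (2 * s + 1) * start   # degrees s and s+1
        for l in range(s + 2, n + 1):                 # upward recurrence in degree
            prev, cur = cur, (mu * (2 * l - 1) * cur - (l + s - 1) * prev) / (l - s)
        value = cur
    return value * math.factorial(n - s) / math.factorial(n)


def biaxial_lhs(n, theta, lam, theta1, lam1):
    """p_n of the cosine of the angular distance between the two directions."""
    cos_dist = (math.cos(theta) * math.cos(theta1)
                + math.sin(theta) * math.sin(theta1) * math.cos(lam1 - lam))
    return p_ns(n, 0, cos_dist)


def biaxial_rhs(n, theta, lam, theta1, lam1):
    """Right-hand side of the addition theorem."""
    f = math.factorial
    mu, mu1 = math.cos(theta), math.cos(theta1)
    total = p_ns(n, 0, mu) * p_ns(n, 0, mu1)
    for s in range(1, n + 1):
        weight = f(n) * f(n) / (f(n - s) * f(n + s))
        total += 2.0 * weight * p_ns(n, s, mu) * p_ns(n, s, mu1) * math.cos(s * (lam1 - lam))
    return total


lhs = biaxial_lhs(4, 0.7, 0.3, 1.9, 2.2)
rhs = biaxial_rhs(4, 0.7, 0.3, 1.9, 2.2)
print(lhs, rhs)
```

The two sides should agree to rounding error for any choice of degree and angles.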


24·131. Derivation from two-dimensional transformation. In the wave mechanics of complex 
atoms it is necessary to study the transformation properties of $p_n^s(\cos\theta)\,e^{is\lambda}$ under a rotation. 
This can be done by applying the method of 4·102, relating a rotation in three dimensions to a unitary 
transformation of two variables. 

Since $\dfrac{\partial^{\,l+m+n}}{\partial x^l\,\partial y^m\,\partial z^n}\dfrac{1}{r}$ transforms like $a^l b^m c^n$, where $a$, $b$, $c$ may be components of any vector, and 
$\nabla^2(1/r) = 0$, we may impose on $(a, b, c)$ the further restriction that it is a complex null vector. Then we 
have the following correspondence of transformation properties for $s$ positive or zero: 


K‘ = 


- AW —^ \"~ $1 

\dx + l &y) \dzj r 


like 


(a+ib) 1 c n ~* like x% ’x% + ‘. 


(1) 


(-1)*#:^ = #!•_!= like (a — ib) ! c n ~* like (-1)*^*;-. 

Therefore for all 8 irrespective of sign the functions K a _ n _ 1 transform like 
Now xf x x + xf x t is invariant and so therefore is (xf x 1 + x$ x a ) 2n , that is, 



(2n) 1 

(n — 8 )!{«. + «)! 


(x* x i) n ~* (x* x t ) n+ *. 


If we write therefore 


X,= 


x{ °xf 


{(n — 8 )! (n + «) l} 1 * 


( 2 ) 

(3) 

(4) 


X, is invariant under our unitary transformations of x lt x t and the transformation of X, is 
also unitary. Therefore 

( n l)2 fj P&(P)P*n(P') e j s{ X’-\) 


( 6 ) 


is invariant. But if we rotate the axes so that $\theta'$ becomes 0, $\theta = \vartheta$ and all the $p_n^s(\mu')$ vanish for $s \ne 0$, 
and $p_n(\mu') = 1$. Then we have again 

$$p_n(\cos\vartheta) = p_n(\cos\theta)\,p_n(\cos\theta') + 2\sum_{s=1}^{n}\frac{n!\,n!}{(n-s)!\,(n+s)!}\;p_n^s(\cos\theta)\,p_n^s(\cos\theta')\cos s(\lambda' - \lambda). \tag{6}$$


24·14. Relation to Bessel functions. Consider the neighbourhood of a pole of a 
sphere of large radius $a$. The curvature of the surface is small and we should expect 
that the suitable potential functions will approximate to those useful with cylindrical 
coordinates, $z$ corresponding to $r - a$, $\varpi$ to $a\sin\theta$, and $\lambda$ to $\lambda$. The factor in $r$ can be 
written approximately, for $(r - a)/a$ small, 

$$\left(\frac{a}{r}\right)^{n+1} \approx e^{-\kappa z}, \tag{1}$$

with $n = \kappa a$. Neglecting the difference between $\theta$ and $\sin\theta$ we have from 18·06(6) for 
the $\theta$ factor 

$$\frac{d^2y}{d\varpi^2} + \frac{1}{\varpi}\frac{dy}{d\varpi} + \left(\kappa^2 - \frac{s^2}{\varpi^2}\right)y = 0, \tag{2}$$

and the solution finite at $\varpi = 0$ is $J_s(\kappa\varpi)$. Hence the solutions $r^{-n-1}p_n^s\,(\cos s\lambda, \sin s\lambda)$ 
correspond to the solutions in cylindrical coordinates $e^{-\kappa z}J_s(\kappa\varpi)\,(\cos s\lambda, \sin s\lambda)$. The usual 
$n$ of Bessel functions, however, corresponds to the $s$ of Legendre functions. An appreciable 
error will accumulate if $\theta$ is large enough for the difference between $\sin\theta$ and $\theta$ no longer 
to be neglected. The constant factor follows at once from 24·04(21); the first term in $p_n^s$ is 

$$\frac{(n+s)!}{2^s\,s!\,n!}\sin^s\theta \approx \frac{1}{s!}\left(\tfrac12 n\sin\theta\right)^s \tag{3}$$

if $n$ is large compared with $s$, and this is the first term of $J_s(n\sin\theta)$ as it stands. Hence 
for given $n\theta$, as $n \to \infty$, 

$$p_n^s(\mu) \to J_s(n\sin\theta). \tag{4}$$

With the same approximation the factors $(n!)^2/(n-s)!\,(n+s)!$ in the formula for change 
of axes tend to 1, and $n\sin\vartheta$ tends to $\kappa R$, where $R$ is the distance between two points 
in the plane. Hence 

$$J_0[\kappa\sqrt{\{\varpi^2 + \rho^2 - 2\varpi\rho\cos(\lambda - \lambda')\}}] = J_0(\kappa\varpi)J_0(\kappa\rho) + 2\sum_{s=1}^{\infty}J_s(\kappa\varpi)J_s(\kappa\rho)\cos s(\lambda - \lambda'), \tag{5}$$

which is the addition formula for Bessel functions, originally found by Heine by this 
limiting process. We notice also that 24·04(19) becomes 

$$J_{-s}(\kappa\varpi) = (-1)^s J_s(\kappa\varpi), \tag{6}$$

which is the relation already found for Bessel functions when $s$ is an integer. 
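
The addition formula for Bessel functions lends itself to a direct numerical check. The sketch below evaluates $J_s$ from its power series, which is adequate for the moderate arguments used; the helper names, the truncation limits and the sample values of $\kappa$, $\varpi$, $\rho$ and $\lambda - \lambda'$ are illustrative assumptions.

```python
import math


def bessel_j(s, x, terms=60):
    """J_s(x) for integer s >= 0 from its power series (adequate for moderate x)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k * (x / 2.0) ** (2 * k + s)
                  / (math.factorial(k) * math.factorial(k + s)))
    return total


def addition_rhs(kappa, w, rho, dlam, smax=40):
    """J_0 J_0 + 2 * sum over s of J_s J_s cos(s * dlam), truncated at smax."""
    total = bessel_j(0, kappa * w) * bessel_j(0, kappa * rho)
    for s in range(1, smax + 1):
        total += 2.0 * bessel_j(s, kappa * w) * bessel_j(s, kappa * rho) * math.cos(s * dlam)
    return total


# Sample values (assumptions): w stands for the cylindrical radius of one point,
# rho for that of the other, dlam for the difference of their azimuths.
kappa, w, rho, dlam = 1.3, 2.0, 0.9, 0.8
R = math.sqrt(w * w + rho * rho - 2.0 * w * rho * math.cos(dlam))  # distance in the plane
lhs = bessel_j(0, kappa * R)
rhs = addition_rhs(kappa, w, rho, dlam)
print(lhs, rhs)
```

The series in $s$ converges very rapidly once $s$ exceeds the arguments, so a modest truncation suffices here.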
24·15. Asymptotic formulae for $n$ large and $\theta$ not small. In the differential 
equation for $p_n^s$ 

$$\frac{d^2y}{d\theta^2} + \cot\theta\,\frac{dy}{d\theta} - s^2\operatorname{cosec}^2\theta\;y + n(n+1)\,y = 0$$

we remove the term in $y'$ by the substitution 

$$y = \sin^{-1/2}\theta\;z. \tag{1}$$

We get 

$$\frac{d^2z}{d\theta^2} + \{(n + \tfrac12)^2 - (s^2 - \tfrac14)\operatorname{cosec}^2\theta\}\,z = 0. \tag{2}$$

When $n$ is large and $s$ not large asymptotic solutions are therefore simply 

$$y \sim \sin^{-1/2}\theta\;e^{\pm(n+\frac12)i\theta}, \tag{3}$$

so that the relation to $J_s\{(n + \tfrac12)\theta\}$ is much closer than that to $J_s(n\sin\theta)$ found by 
considering only small values of $\theta$. The constants can be found by noting that we can take $\theta$ 
so that $n\theta$ is large and $n\theta^2$ small, and then comparing with Stokes’s asymptotic expansion 
for $J_s$ in this range, namely (cf. 21·05 (5)) 

$$J_s\{(n + \tfrac12)\theta\} \sim \left\{\frac{2}{\pi(n + \tfrac12)\theta}\right\}^{1/2}\cos\{(n + \tfrac12)\theta - \tfrac12 s\pi - \tfrac14\pi\}. \tag{4}$$

Hence 

$$p_n^s \sim \left\{\frac{2}{\pi(n + \tfrac12)\sin\theta}\right\}^{1/2}\cos\{(n + \tfrac12)\theta - \tfrac12 s\pi - \tfrac14\pi\}, \tag{5}$$

where $\theta$ is not near 0 or $\pi$. The full expansion of this type is given by Hobson, of 
course with a different coefficient on account of his definition of the function. 
Unfortunately in practice $s$, if not zero, is usually comparable with $n$, and expansions of 
Stokes’s type proceed in powers of $s^2/n$ and consequently are not often useful. In fact 
the approximation obviously breaks down completely when $s = n$, when the function 
has no zeros between 0 and $\pi$. 


24·16. Definite integral representations. We have 

$$f^{(n)}(z) = \frac{n!}{2\pi i}\int_C \frac{f(t)\,dt}{(t - z)^{n+1}}, \tag{1}$$

where $f(t)$ is analytic within $C$ and $z$ is within $C$. Hence 

$$p_n^s(\mu) = (1 - \mu^2)^{s/2}\,\frac{(n-s)!}{2^n (n!)^2}\,\frac{d^{n+s}}{d\mu^{n+s}}(\mu^2 - 1)^n = \frac{(n-s)!\,(n+s)!}{2^n (n!)^2}\,\frac{(1 - \mu^2)^{s/2}}{2\pi i}\int_C \frac{(t^2 - 1)^n\,dt}{(t - \mu)^{n+s+1}}, \tag{2}$$

provided $C$ encloses $\mu$. Since $n$ has so far been taken as a positive integer there is no 
singularity at $t = \pm 1$, and $C$ can be taken as large as we like. But even if $n$ and $s$ are not 
integers it is easy to verify that (2) satisfies the differential equation for $p_n^s(\mu)$, and also 
holds for unrestricted $\mu$, provided the path $C$ is such that the integrand returns to its 
original value on describing it. 

The integral (2) is due to Schläfli, and is related to his integral for the Bessel functions. 
If we put $t - \mu = \lambda$, 

$$p_n^s(\mu) = \frac{(n-s)!\,(n+s)!}{2^n\,n!\,n!}\,\frac{(1 - \mu^2)^{s/2}}{2\pi i}\int_C \left(2\mu + \lambda + \frac{\mu^2 - 1}{\lambda}\right)^n \frac{d\lambda}{\lambda^{s+1}}, \tag{3}$$

and with $\lambda = (1 - \mu^2)^{1/2}u$ 

$$p_n^s(\mu) = \frac{(n-s)!\,(n+s)!}{2^n\,n!\,n!}\,\frac{1}{2\pi i}\int \left\{2\mu + (1 - \mu^2)^{1/2}\left(u - \frac1u\right)\right\}^n \frac{du}{u^{s+1}}. \tag{4}$$

If $(1 - \mu^2)^{1/2}$ is small, the contours fixed, and $n$ large, this approximates to 

$$\frac{1}{2\pi i}\int \exp\left\{\tfrac12 n\sin\theta\left(u - \frac1u\right)\right\}\frac{du}{u^{s+1}}, \tag{5}$$

which is Schläfli’s integral for $J_s(n\sin\theta)$. A factor has been replaced by 1, so that the 
approximation assumes $n\theta$ moderate and $n\theta^2$ small. 

With integral $n$ and $s$ but $\mu$ complex we can take $C$ in the $t$-plane to be a circle with 
centre $\mu$ and radius $|\mu^2 - 1|^{1/2}$. In general one of $t = \pm 1$ is inside $C$ and the other 
outside. Then with 

$$t = \mu + (\mu^2 - 1)^{1/2}e^{i\phi}, \tag{6}$$

we have 

$$p_n^s(\mu) = \frac{(n-s)!\,(n+s)!}{2^n (n!)^2}\,\frac{(1 - \mu^2)^{s/2}}{2\pi}\int_{-\pi}^{\pi}\frac{\{\mu - 1 + (\mu^2 - 1)^{1/2}e^{i\phi}\}^n\,\{\mu + 1 + (\mu^2 - 1)^{1/2}e^{i\phi}\}^n}{(\mu^2 - 1)^{(n+s)/2}\,e^{(n+s)i\phi}}\,d\phi. \tag{7}$$

But 

$$\{(\mu - 1)e^{-\frac12 i\phi} + (\mu^2 - 1)^{1/2}e^{\frac12 i\phi}\}\{(\mu + 1)e^{-\frac12 i\phi} + (\mu^2 - 1)^{1/2}e^{\frac12 i\phi}\}$$
$$= (\mu^2 - 1)(e^{i\phi} + e^{-i\phi}) + 2\mu(\mu^2 - 1)^{1/2} = 2(\mu^2 - 1)^{1/2}\{\mu + (\mu^2 - 1)^{1/2}\cos\phi\}, \tag{8}$$

and 

$$p_n^s(\mu) = \frac{(n-s)!\,(n+s)!}{n!\,n!}\,\frac{(1 - \mu^2)^{s/2}}{(\mu^2 - 1)^{s/2}}\,\frac{1}{2\pi}\int_{-\pi}^{\pi}\{\mu + (\mu^2 - 1)^{1/2}\cos\phi\}^n e^{-is\phi}\,d\phi$$
$$= \frac{(n-s)!\,(n+s)!}{n!\,n!}\,\frac{(1 - \mu^2)^{s/2}}{(\mu^2 - 1)^{s/2}}\,\frac{1}{\pi}\int_0^{\pi}\{\mu + (\mu^2 - 1)^{1/2}\cos\phi\}^n\cos s\phi\,d\phi. \tag{9}$$

Powers of $\cos\phi$ up to the $(s-1)$th will give 0 on integration, and if we take $(\mu^2 - 1)^{1/2}$ to 
mean $i(1 - \mu^2)^{1/2}$ we must take $i^{-s}$ outside. This is Laplace's integral. It will be noticed that 
the integrand always has modulus $\le 1$ for $-1 \le \mu \le 1$ and therefore $|p_n(\mu)| \le 1$. This 
integral yields only one solution of the differential equation. If we reverse the sign of $i$ 
and then put $\phi = \pi - \psi$ we clearly recover the same integral except possibly for a change 
of sign. Similarly in (2), either the path $C$ encloses $\mu$ and gives $p_n^s$, or does not enclose $\mu$ 
and gives 0 if $n$ and $s$ are integers. It is useless to take a path going to infinity since the 
integral diverges. Other solutions can be obtained if $n$ is not a positive integer, but the 
specification of the paths to make the integrand single-valued becomes difficult. A full 
treatment of this case is given by Hobson. Unlike what has happened in several other 
cases, we get no fundamentally new solution by varying the path for positive integral $n$, $s$. 
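
Laplace's integral is easy to test numerically for $-1 < \mu < 1$. The sketch below reads $(\mu^2 - 1)^{1/2}$ as $i(1 - \mu^2)^{1/2}$, takes the factor $i^{-s}$ outside, and uses the prefactor $(n-s)!\,(n+s)!/(n!\,n!)$; it compares the result with a recurrence for $p_n^s$ under the assumed normalization $p_n^s = \frac{(n-s)!}{n!}(1-\mu^2)^{s/2}\,d^s p_n/d\mu^s$. The trapezoidal rule and the function names are illustrative choices.

```python
import math


def p_ns(n, s, mu):
    """Assumed normalization: (n-s)!/n! * (1-mu^2)^(s/2) * d^s P_n / d mu^s."""
    root = math.sqrt(max(0.0, 1.0 - mu * mu))
    start = 1.0
    for k in range(1, s + 1):
        start *= (2 * k - 1) * root
    if n == s:
        value = start
    else:
        prev, cur = start, mu * (2 * s + 1) * start
        for l in range(s + 2, n + 1):
            prev, cur = cur, (mu * (2 * l - 1) * cur - (l + s - 1) * prev) / (l - s)
        value = cur
    return value * math.factorial(n - s) / math.factorial(n)


def laplace_integral(n, s, mu, panels=400):
    """Laplace's integral over 0..pi, with (mu^2-1)^(1/2) read as i(1-mu^2)^(1/2)."""
    root = 1j * math.sqrt(1.0 - mu * mu)
    step = math.pi / panels
    total = 0.0 + 0.0j
    for k in range(panels + 1):
        # Trapezoidal rule; the integrand is a cosine polynomial in phi,
        # for which this rule is accurate to rounding error.
        weight = 0.5 if k in (0, panels) else 1.0
        phi = k * step
        total += weight * (mu + root * math.cos(phi)) ** n * math.cos(s * phi)
    f = math.factorial
    factor = f(n - s) * f(n + s) / (f(n) * f(n))
    return (1j ** (-s)) * factor * total * step / math.pi


value = laplace_integral(3, 2, 0.4)
reference = p_ns(3, 2, 0.4)
print(value, reference)
```

The imaginary part of the computed value should vanish to rounding error, since $p_n^s$ is real on the cut.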


24·161. Another method is to begin with 

$$(1 - 2h\mu + h^2)^{-1/2} = \sum h^n p_n \quad (h < 1), \tag{10}$$

$$\frac{d^s}{d\mu^s}(1 - 2h\mu + h^2)^{-1/2} = \sum h^n (1 - \mu^2)^{-s/2}\,\frac{n!}{(n-s)!}\,p_n^s \tag{11}$$

$$= 2^s h^s\,\tfrac12\cdot\tfrac32\cdots(s - \tfrac12)\,(1 - 2h\mu + h^2)^{-s-1/2}, \tag{12}$$

$$(1 - \mu^2)^{s/2}\,\frac{1\cdot 3\cdots(2s-1)}{(1 - 2h\mu + h^2)^{s+1/2}} = \sum h^{n-s}\,\frac{n!}{(n-s)!}\,p_n^s, \tag{13}$$

and 

$$p_n^s = \frac{(n-s)!}{n!}\,1\cdot 3\cdots(2s-1)\,\frac{(1 - \mu^2)^{s/2}}{2\pi i}\int_C \frac{dh}{h^{n-s+1}(1 - 2h\mu + h^2)^{s+1/2}}, \tag{14}$$

where $C$ surrounds the origin but neither of the points $h = e^{\pm i\theta}$. 

To verify that this satisfies the differential equation, put 

$$p_n^s = A(1 - \mu^2)^{s/2}\,\Theta, \tag{15}$$

where $A$ is the constant factor, and 

$$1 - 2h\mu + h^2 = X; \tag{16}$$

$$(1 - \mu^2)\Theta'' - 2(s+1)\mu\Theta' + (n-s)(n+s+1)\Theta$$
$$= \int_C \left[\frac{(2s+1)(2s+3)(1 - \mu^2)}{h^{n-s-1}X^{s+5/2}} - \frac{2(s+1)(2s+1)\mu}{h^{n-s}X^{s+3/2}} + \frac{(n-s)(n+s+1)}{h^{n-s+1}X^{s+1/2}}\right]dh. \tag{17}$$

Now 

$$1 - \mu^2 = X - (\mu - h)^2, \qquad \frac{dX}{dh} = -2(\mu - h), \tag{18}$$

and we proceed to eliminate [i by partial integration. We arrange (17) as follows: 

jo U 25 + 1 ^ 25 + {jp-s-~ h n ~ s ~ iX s + 6 /a) 


Now 

-u 


- 2(8 + 1 ) (25 + 1 ) ( h Sifri*k + hns-lX^i) + ^ hn-^X 8 ^ (19) 


(2a +1) (2a+ 3) (/i — h) 2 2(a +1) (2a + l)(/t — h) 


hn-s-ix^k 


J l n~ 8 Xs+ s la 


dh 


2(a+1) (/i—h) \ 

T «i VoJ_8/U I 


|~ (2a+ 1) (/i — h) ~\ -.r / ( n-8-\)(fi-h ) 1 _ 

|_ h n - 8 ~ x X 8 + z i* J ^ Vol h n ~ 8 X 8 + 3 ^ h n - 8 X s+s h ) 

_ n —(2a+ 11 f *) (Z 6- ^) ■_ 1 \dh 

- LJ (2a+i)J^ h n ~ a X 8 + 3 h + h n -*~ *X**h] 

r _ r tt+a + l“| . , w . P dh .. 6 

- ~ [] ~ _ ( n + 5+1 )( n-5 )J c ^n- S +lJ^ a+ x / 2 - ( 25 +1 )J 


dh 


dh 

- 1 X*>+ 3 la m 


( 20 ) 


The first integral remaining cancels the last term of (19); the second, taken with the 
remaining terms of (19), gives an expression containing the factor 

$$(2s+1)(2s+3) - 2(s+1)(2s+1) - (2s+1) = 0.$$

Hence 

$$(1 - \mu^2)\Theta'' - 2(s+1)\mu\Theta' + (n-s)(n+s+1)\Theta = -\left[\frac{(2s+1)(\mu - h)}{h^{n-s-1}X^{s+3/2}} + \frac{n+s+1}{h^{n-s}X^{s+1/2}}\right]_C, \tag{21}$$

and therefore the integral will satisfy the differential equation provided that the 
expression in [ ] returns to its original value on describing the path; and this is true not only for 
integral $n$ and $s$. In particular it is true if $C$ is any loop from infinity around one of the 
possible singularities at $h = 0$ and $e^{\pm i\theta}$, provided $n + s + 1 > 0$. If $n - s$ is an integer (21) 
vanishes if $C$ is a closed path about the origin, not including the other singularities, and 
the integrand will be single-valued if we break the path and complete it by two lines to 

infinity, so that the infinite lines of the path will cancel. Hence if we take a loop from 
infinity, passing about 0, it will afford a definition of $p_n^s$ when $n$ and $s$ are fractional 
($n + s + 1 > 0$), reducing when they are integers to the previous definition. 

If $n$ and $s$ are integers the integral around a large circle will be zero, and we can replace 
$C$ by a path about the points $h = \exp(\pm i\theta)$. It is usual to reduce the path to a circular 
arc $|h| = 1$ about the origin connecting the points $h = \exp(\pm i\theta)$. The result is Mehler’s 
integral; but clearly it will diverge if $s \ge \tfrac12$. On the other hand we could replace $s$ 
by $-s$; then the factor outside the integral is proportional to $(1 - \mu^2)^{-s/2}$, which is 
not convenient in a function that behaves like $(1 - \mu^2)^{s/2}$ when $\mu$ is near $\pm 1$. It seems that 
integrals of this type can be useful in practice only for $s = 0$; then it is found that 

$$p_n(\cos\theta) = \frac{2}{\pi}\int_0^{\theta}\frac{\cos(n + \tfrac12)\psi}{\{2(\cos\psi - \cos\theta)\}^{1/2}}\,d\psi, \tag{22}$$

$$p_n(\cos\theta) = \frac{2}{\pi}\int_{\theta}^{\pi}\frac{\sin(n + \tfrac12)\psi}{\{2(\cos\theta - \cos\psi)\}^{1/2}}\,d\psi. \tag{23}$$

These formulae are due to Mehler. 

24·162. Bateman’s integral. A different type of definite integral solution is given 
by Bateman. We have, with some modifications of his method, if 

$$R^2 = x^2 + y^2 + (z - a)^2, \tag{24}$$

and $a$ and $b$ are small constants, 


Then 


— — I f°° 

R 7TJ -a 0 x 2 +y 2 + (z-a) 2 +(w-b) 2 ' 



2 n nl T {x + iy) s {z-a + i{w-b)} n -* 
n J {x 2 + y 2 +( z -a ) 2 +{w- 6 ) 2 }”+! dw ' 


(25) 

(26) 


But $1/R$ is a function of $z - a$ and is independent of $b$. Hence the operation $\partial/\partial b$ gives 0, 
and $\partial/\partial a$ is equivalent to $-\partial/\partial z$, and if we make $a$ and $b$ tend to 0 we have 



(x + iy) 8 (z + iw) n ~ 8 
(x 2 + y 2 + z 2 + w 2 ) n+x W ' 


(27) 


But the left side is 

$$(-1)^n\,n!\,\frac{p_n^s\,e^{is\lambda}}{r^{n+1}},$$

and therefore 

$$p_n^s = \frac{2^n}{\pi}\sin^s\theta\int_{-\infty}^{\infty}\frac{(\mu + iw/r)^{n-s}}{(1 + w^2/r^2)^{n+1}}\,\frac{dw}{r} = \frac{2^n}{\pi}\sin^s\theta\int_{-\infty}^{\infty}\frac{(\mu + it)^{n-s}}{(1 + t^2)^{n+1}}\,dt. \tag{28}$$

This converges for $s$ positive if $n + s > 0$ and $\mu$ is not purely imaginary; even if $\mu$ is purely 
imaginary the point $t = i\mu$ gives no trouble if $n > s - 1$. It therefore provides a definition 
of the function in all practical circumstances. If we put $\lambda = it$ we get 

$$p_n^s = \frac{2^n}{\pi i}\sin^s\theta\int_{-i\infty}^{i\infty}\frac{(\lambda + \mu)^{n-s}}{(1 - \lambda^2)^{n+1}}\,d\lambda. \tag{29}$$
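
Bateman's integral over real $t$ can be checked numerically for $-1 < \mu < 1$. In the sketch below the substitution $t = \tan u$ is an illustrative choice, not part of the text: it maps the infinite range onto $(-\tfrac12\pi, \tfrac12\pi)$ and turns the integrand into $(\mu\cos u + i\sin u)^{n-s}\cos^{n+s}u$. The reference values assume the normalization $p_n^s = \frac{(n-s)!}{n!}(1-\mu^2)^{s/2}\,d^s p_n/d\mu^s$, and the function names are hypothetical.

```python
import math


def p_ns(n, s, mu):
    """Assumed normalization: (n-s)!/n! * (1-mu^2)^(s/2) * d^s P_n / d mu^s."""
    root = math.sqrt(max(0.0, 1.0 - mu * mu))
    start = 1.0
    for k in range(1, s + 1):
        start *= (2 * k - 1) * root
    if n == s:
        value = start
    else:
        prev, cur = start, mu * (2 * s + 1) * start
        for l in range(s + 2, n + 1):
            prev, cur = cur, (mu * (2 * l - 1) * cur - (l + s - 1) * prev) / (l - s)
        value = cur
    return value * math.factorial(n - s) / math.factorial(n)


def bateman_integral(n, s, mu, panels=400):
    """(2^n / pi) sin^s(theta) times the integral over real t, using t = tan(u)."""
    step = math.pi / panels
    total = 0.0 + 0.0j
    for k in range(panels):
        u = -0.5 * math.pi + (k + 0.5) * step   # midpoint rule on (-pi/2, pi/2)
        total += (mu * math.cos(u) + 1j * math.sin(u)) ** (n - s) * math.cos(u) ** (n + s)
    sin_theta = math.sqrt(1.0 - mu * mu)
    return (2 ** n / math.pi) * sin_theta ** s * total * step


value = bateman_integral(4, 2, 0.3)
reference = p_ns(4, 2, 0.3)
print(value, reference)
```

After the substitution the integrand is a trigonometric polynomial of period $\pi$, so the midpoint rule is accurate to rounding error with a modest number of panels.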
If we use a different path the integral may still satisfy the differential equation; for if 
we take 

$$\Theta = \int\frac{(\mu + it)^{n-s}}{(1 + t^2)^{n+1}}\,dt, \tag{30}$$

$$(1 - \mu^2)\Theta'' - 2(s+1)\mu\Theta' + (n-s)(n+s+1)\Theta$$
$$= \int\frac{(n-s)(n-s-1)(1 - \mu^2)(\mu + it)^{n-s-2} - 2(n-s)(s+1)\mu(\mu + it)^{n-s-1} + (n-s)(n+s+1)(\mu + it)^{n-s}}{(1 + t^2)^{n+1}}\,dt. \tag{31}$$

$n - s$ is a factor, and 

$$(n-s-1)(1 - \mu^2) - 2(s+1)\mu(\mu + it) + (n+s+1)(\mu + it)^2$$
$$= n - s - 1 + 2n\mu it - (n+s+1)t^2$$
$$= n - s - 1 + 2int(\mu + it) + 2nt^2 - (n+s+1)t^2$$
$$= (n-s-1)(1 + t^2) + 2int(\mu + it).$$

Hence 

$$\frac{\{(n-s-1)(1 + t^2) + 2int(\mu + it)\}(\mu + it)^{n-s-2}}{(1 + t^2)^{n+1}} = -i\,\frac{d}{dt}\,\frac{(\mu + it)^{n-s-1}}{(1 + t^2)^n},$$

$$(1 - \mu^2)\Theta'' - 2(s+1)\mu\Theta' + (n-s)(n+s+1)\Theta = -i(n-s)\left[\frac{(\mu + it)^{n-s-1}}{(1 + t^2)^n}\right], \tag{32}$$

and vanishes for any path with infinite ends if $n + s + 1 > 0$. The path chosen for $p_n^s$ passes 
between the two singularities at $t = \pm i$; but we shall be able to obtain another solution 
by taking a path from $t = i\mu$ to infinity. The terms obtained from differentiating the 
limits vanish if $s < n - 1$. For $s = n$ they vanish, for $s = n - 1$ they cancel the right side 
of (32). Hence (30), with termini $i\mu$ and $\infty$, is a solution for $n + s + 1 > 0$, $n - s + 1 > 0$. 

24·17. Solutions when $\mu$ is not real and between $-1$ and 1. Far the greater number of 
applications of Legendre’s equation require only the solutions $p_n^s$. The other solution may arise, 
possibly with non-integral $n$, in problems relating to a spherical boundary when the poles are excluded, 
and also in external problems for spheroids. $\mu$ is real and greater than 1 for the region outside a prolate 
spheroid, purely imaginary outside an oblate spheroid, and its modulus may be arbitrarily large. 
Hence it is convenient to replace the factor $(1 - \mu^2)^{s/2}$ by $(\mu^2 - 1)^{s/2}$ and to take the first solution as 

$$t_n^s(\mu) = \frac{(n-s)!}{2^n (n!)^2}\,(\mu^2 - 1)^{s/2}\,\frac{d^{n+s}}{d\mu^{n+s}}(\mu^2 - 1)^n. \tag{1}$$

This will be real and positive for all real $\mu > 1$. By 24·162 (29) we have also 


t'nifi) = -.(/**- l) 1/a *f 

7 n J. 


ico (A+/*)"-• 

-<co(l—A 2 ) n+1< 


which can be extended to alln,« such that n+8+ 1>0. If we put A = —/f — u and then v = — (/ 4 s — l) 1 ^ 
we get 

2" C im (—v) n -*dv 

tl(u) = — (a 2 -1 )%* -_—_ 







As in 24*16(5), using 21*01 (16), if p is real and > 1, this tends to I g {n(p* — If 1 *} when n(p 9 — l) 1 ** is 
fixed and n-voo. The path must always be taken so as to pass between the poles of the integrand; 
these tend to 0 and + oo. 

The most convenient form for the second solution is, for n — 8 +1 and n+s +1 positive, 

•°° (u-fi)”-' 


2»+i (u -«)"-* 

—W-D*■/, 


M ^ a -1)»+ 1 ""’ (4) 

where the path does not intersect the real axis between ±1. If we take a cut in the p plane from — 1 
to +1, q‘ n (p) will be analytic and single-valued except possibly at points on the cut. If we put u—p = A, 
we get 


s:w =i </>*- i)**j" 


(S) 


If n(p* — 1)^* = x, and n -> oo, p -*• 1 + we find 

?»(/*) -> Kh_,(x) = Kh,(x) (6) 

by a similar argument to that leading to 24*16 (5), using 21-022 (60). Hence q„(p) defined by (4) is 
related to Kh # {n<y(/**— 1)} in the same way as ^ to I, and p* n to J,. 

n 

In the expression (4) put u = -—- 

2 n + 1 f 1 ( 

then qh{p) = (/** — l) 1/a> J p- n ~'- x v n ~'(l — t?) n +*|l-——| dv. 


Expand in powers of /t -a and integrate term by term; we have 

. 2 n+1 , „ ,(n —s)! °° (n + m)\ (n+s+2m)! 

q'Ju) = -(a a — 1)%* 1- - y. -- —— -1-— u~ im . 

W n W ) /* nl 2u ml #2n + i + 2m)r 


(7) 


This converges for | p a | > 1. 

It is possible to express q* n in finite terms. Take first a = 0; then t n (p) = pjji). The roots of the 
indicia! equation at /t = ±1 are both 0, and the differentia! equation has no other singularities. Hence 
any solution of Legendre’s equation has the form 


0 = At n (fi) log (/*-!) + Bt n (fi) log (/*+!) + Gt n (fi)-f n (p). 


( 8 ) 


where f n (p) is analytic at p = ± 1 and is therefore an integral function. But t n (p) and q n (p) are single¬ 
valued for | p | > 1 ; hence A = — B. If 0 = q n (p), 0 = 0(p~ n ~ x ) for | p | large, and t n (p) is a polynomial 
of degree n. Then the first two terms are together of order p n ~ x . Take C zero. Then 0 will tend to zero 
as | p | -+ co if and only if 

( 9 ) 


$»(/0 = -Bj«„(/t)log^^-/ n (^)j, 


where f„(p) must be the expansion of the first term in the brackets { } in descending powers of p, as 
far as the constant term. Evidently it is of degree n — 1. (Further the coefficients of powers from p~ x 
to p~ n must evidently be zero.) 

To determine B, return to (7) with 8 = 0, and notice that if m is large 


2 n+1 (n + m )! (n + 2m)! 
it ml (2n+l + 2m)l 


P~ 2m = ~/t- a 4— 

7 t \m % j f 


and therefore for | p | > 1 


= ^H2» + 1 + 0 U} 

(10) 

H log "-i +0< 4 

(11) 

— 1/7r, and finally 


r {wiog^_ 1 f n (p) J, 

(12) 

(n — 8)1 d * 

(13) 


also 
This expresses $q_n^s(\mu)$ in finite terms. It will be noticed that the differentiations will give terms in 
$(\mu - 1)^{-s}$ and $(\mu + 1)^{-s}$, while the extra variable factor needed is $(\mu^2 - 1)^{s/2}$. Hence near $\mu = \pm 1$, $q_n^s(\mu) \to \infty$ 
like $(\mu^2 - 1)^{-s/2}$. 

Another definite integral for q n (p) can be found as follows. We start with 


S = S h n q n (p) 


(«* — l) n+1 


= - f° 

* J, 

_ 2 J* 00 du _2 f * du 

n) ll u % —l — 2uh+2ph ~ u J n (u—h) 9 — ( 1 — 


2ph + h 9 ) 


on summation under the integral sign, which is possible for all real u >p> 1 if |A|<1; then 


S = 


tt(1 - 2 ph+h 9 ) 1 ^ ] ° e p-h-(l-2ph+h 9 )*' 


P — h + (1 — 2 ph + h 9 ) 1 ^ 


which can be regarded as the generating function of q n (p). 
On the other hand consider the series 


n-0 J-l P~V J-l(l- 


dv 


2vh + h 9 ) 1/ *(p — v)’ 


With the substitution 


n+h 

r =- 2 Li- 


dv 


1-2 vh+h 9 = v\ 
1 


(14) 

(15) 

(16) 


log 


(\-2ph+h 9 yh+vy +* 


2 ph + h 9 -v 9 (l-2ph + h 9 ) l ‘* L (1-2 ph + h 9 ) 1 ^ 


-f- v~ 1+ * 


log 


P — h + (1 — 2pJb + h 9 ) 1 ^* 


(17) 

(18) 


(1-2 ph + h 9 )* ^p-h-(l~2ph + h 9 )^ 

= ttS. 

Hence by equating coefficients of h n , q n (p) — - - n dv. 

irj-xp-v 

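Note that if $n$ is even, $t_n(\mu)$ is an even function, $q_n(\mu)$ an odd one, and conversely. 

The following are specimen values of $q_n$. 

$$q_0 = \frac{1}{\pi}\log\frac{\mu + 1}{\mu - 1},$$

$$q_1 = \frac{\mu}{\pi}\log\frac{\mu + 1}{\mu - 1} - \frac{2}{\pi},$$

$$q_2 = \frac{3\mu^2 - 1}{2\pi}\log\frac{\mu + 1}{\mu - 1} - \frac{3\mu}{\pi}.$$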
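
The specimen values can be compared with the definite integral $q_n(\mu) = \frac{1}{\pi}\int_{-1}^{1}\frac{p_n(v)}{\mu - v}\,dv$ for real $\mu > 1$. The sketch below does this by Simpson's rule; the closed forms coded for $n = 0, 1, 2$ are those obtained from this integral, and the function names and the sample value $\mu = 2$ are illustrative assumptions.

```python
import math


def legendre_p(n, x):
    """Legendre polynomial P_n(x) by the three-term recurrence."""
    prev, cur = 1.0, x
    if n == 0:
        return prev
    for l in range(2, n + 1):
        prev, cur = cur, ((2 * l - 1) * x * cur - (l - 1) * prev) / l
    return cur


def q_integral(n, mu, panels=2000):
    """(1/pi) * integral from -1 to 1 of p_n(v)/(mu - v) dv by Simpson's rule (mu > 1)."""
    step = 2.0 / panels
    total = 0.0
    for k in range(panels + 1):
        v = -1.0 + k * step
        weight = 1.0 if k in (0, panels) else (4.0 if k % 2 else 2.0)
        total += weight * legendre_p(n, v) / (mu - v)
    return total * step / (3.0 * math.pi)


def q_closed(n, mu):
    """Closed forms for n = 0, 1, 2 in terms of log((mu+1)/(mu-1))."""
    log_term = math.log((mu + 1.0) / (mu - 1.0))
    if n == 0:
        return log_term / math.pi
    if n == 1:
        return (mu * log_term - 2.0) / math.pi
    if n == 2:
        return ((3.0 * mu * mu - 1.0) * log_term / 2.0 - 3.0 * mu) / math.pi
    raise ValueError("only n = 0, 1, 2 are coded here")


mu = 2.0
differences = [abs(q_integral(n, mu) - q_closed(n, mu)) for n in range(3)]
print(differences)
```

For $\mu$ close to 1 the integrand varies rapidly near $v = 1$ and more panels would be needed.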

24·171. Asymptotic approximation for $n$ large. The series 24·17 (7) is convergent, so that the 
need for approximations of Stokes’s type for given $n$, $s$ and large $|\mu|$ does not arise. If we want the 
behaviour for given $\mu$ and increasing $n$, $s$ we can apply the method of steepest descents to 24·17 (4). Put 

$$\phi(u) = (n - s)\log(u - \mu) - n\log(u^2 - 1), \tag{19}$$

$$\phi'(u) = \frac{n - s}{u - \mu} - \frac{2nu}{u^2 - 1}, \qquad \phi''(u) = -\frac{n - s}{(u - \mu)^2} - \frac{2n}{u^2 - 1} + \frac{4nu^2}{(u^2 - 1)^2}.$$

The condition for a saddle-point gives 

$$(n + s)u = n\mu + \{n^2(\mu^2 - 1) + s^2\}^{1/2} = n\mu + M,$$

since we want the root greater than $\mu$ for $\mu$ real and $> 1$; then 


/2\ Vi [n — s\ n +Va 

?«(/*) ~ j (/i* -1) % * ikf ' 


(n + «) n+ *+ 1 A 


(M — s/t)* (ilf + n/i) n+y * 
valid if w —ji at the saddle-point is several times 10'(w) [ _Va ; this is satisfied if 

n 2 — s 2 M 

-i^>0. 

n nfi + M 

It may be verified easily that if n is large but a is not, the approximation reduces to 


2VI* 


and if we put 


i tJ 1)} ' 

n(/i 2 — l) 1/a = x. 


(n+%) 



( 20 ) 

( 21 ) 

( 22 ) 

(23) 


this is approximately (2/nx) rh e~ m , which is the first term of the asymptotic expansion of Kh 4 (a:). 
For t‘ n {[i) it is found that the relevant saddle-point is where 


and we get 


w 


(n + s)u = n/i — M 

— s\ ” +1 ^ (n + s) ”+* +1 A M~y* 




(s/i + M)* (n/t — M) n+1 b’ 


(24) 


(20) and (24) are valid for all arg fi if M is defined by continuity. 

24·18. Second solution when $-1 < \mu < 1$. We have from 21·022 (67) 

Hs,(nw) = — i e —i/aan-i lKh t (nue~ llani ), (1) 

Hi ,(n«) = i e 1/a,w< Kh,(nw e ViTri ), . (2) 

2Y t (nu) = — e~ s,ttni Kh,{nu e~ llini ) — e 1/aS?r ‘ Kh,(n« e' hni ). (3) 

We take u to be positive. 

We see from 21-02 (42), 21-022 (69) that the coefficients in the expansions of Y,(x) and Kh # (a;) are 
equal in magnitude; but as a;->0 through positive values Kh s (x) ->-oo, Y,(x) — oo. It is convenient 

therefore to take our solution as corresponding to — Y,(nu) instead of Y„(nu). We know that when n 
is large the function g£(/i) behaves like Kh,{n(/t 2 — l) 1/a }. If /t is moved clockwise about +1 so as to reach 
a point between — 1 and 1, 


2«+i f® 

£(/*)-► — (/i 2 -l) 1/a « 

71 J V, 1 -i 


(u — fi)"-' 


du 


[(U*-l) n +l 

2*»+i /*® ( u _ u,) n ~ t 

= ^ = q* n (/i — Oi) 

say. If ft moves counter-clockwise to the same point 

. f 00 («-/*)“"* 


2 n+1 

q* n {fi) -*■ - e Vii,Tr ( 1 — a*) 1 !*’ I 

7T 1 


du = q* n {[i + Oi). 


(4) 


(5) 


J 1) B ’*" 1 

Hence we can define a second solution, real in (— 1,1) and corresponding to — Y,{n( 1 — ji*) 11 *} by taking 


TT 

(n — s)\ 


n\ 


also 


?»(/*) = ?»(/* + 0i) + $e-*i* un q'J/J, - Oi). 


(7) 

( 8 ) 


If n is even, q n (fi) is an odd function when /t is not on the cut; it follows from (8) that it is also an odd 
function when defined by (6) on the cut. Similarly if n is odd, q n (/i) will be an even function on the cut. 
Hence q n (ji) on the cut is always a constant multiple of the non-terminating series solution found 
in 16-04. 







To identify the constant, notice first that, from 24-17 (12), 

«»(/* + Oi) = i log - Mrp n foi) -/ n (/t)J, 
q n (fi-Oi) = ^ j 

and therefore 


PnW log , 
(9) 

( 10 ) 

( 11 ) 


l-/t 

ff»(/0 = ^ Jp«(/i) log ~/n(/4)J 

fjji) being the same polynomial as in 24-17 (9) and (12). Now if»is even the odd solution 16-04 (6) is 

^ _ (n— 1)(n+ 2) t , (n-3)(ra-l)(» + 2)(n + 4) K 

V a -p - - /i + /t ... 


= 2 


,2r+l 


3! 

+ 0 ( 1 ) 


2r+1 
= tanh -1 /t + 0(1). 

If n is odd the even solution, with the sign changed, is found to be 

, ,n(n+l) # (n-2)n(n+l)(n + 3) 4 , 
u «--l+ Tj A +•* 


= ~tanh _1 /i + 0(1), 


and therefore in general 


Hence 


©* = IPnW log + 0(1). 

L ~ r 
2 

«»(/*) = - ©a- 


( 12 ) 


(13) 

(14) 

(16) 


The integral 24-17 (18) can also be adapted to give a form for q n (fi) on the cut. If where /t e is 

on the cut, we can indent the path from — 1 to 1 by a small semicircle below /i Q i hence 


and similarly 


q n (fi + (ti) = - f ^^-dv, 

ITj-l'Hri'il-V 
It J -1, ft+ie fl>—V 

qjji) = -P P ^^dv. 


where P denotes the principal value. 

We find also, for h small, 

Yhn 1 J(l-2/ih + h*)+p-h 

q * W 7T*J(l — 2/ih + h 2 ) g V(l-2 jih + h*)-/i + h' 


Now if 
this leads to 

whence 


r = V(* 2 + y a + zS )» h = o/r, 2 = pur, B = *J{x 2 +y* + {z- a)*}, 

is(-Vg,w = J -iog Ji+ *~ a . 

r \r) ** w jrfi *iJ-5+o’ 

, / v 1 d n / 1 B + z — o\ 

r—= to — — 


(- 1 ) 


L’i:(ii og r±£). 

! 8z n \r & r—z/ 


r + z 

Also by comparison of the terms containing log-, using 24-04 (22) 


» ,-n_1 g£(/0 e< * A = 


(- 1 ) 


7 ml 


\8x 8y) 8z n ~* \r °r—zf 


(16) 

(17) 

(18) 

(19) 

( 20 ) 
( 21 ) 

( 22 ) 

(23) 

(24) 


24·19. Asymptotic approximation to $p_n^s(\mu)$ for $n$ and $s$ both large.* By the method of
steepest descents we find, using 24·162 (29), for $0 < \mu < 1$,


PM 


1 . a (n-a\ 

V(2tt) \ n ) 


n+1 A (n + 5 )"+*+% 

(M + ji8)* (n/i — M) n ^ * 


( 1 ) 


where $M = \sqrt{(s^2 - n^2\sin^2\theta)} > 0$.

If $M^2 < 0$, put $N = \sqrt{(n^2\sin^2\theta - s^2)} > 0$, $\arg(s\mu + iN) = \alpha$, $\arg(n\mu - iN) = -\beta$;
f2\ (n — ^)V» n — 1 /a»-t J /4(j^ 


then 


PM 


~ /1 — 


n"+ 1 A 


iV- 1 * sin {(n + i)/?-sa + £7r}. 


(2) 


24·20. Other solutions of Laplace's equation in spherical polar coordinates. In the
simplest possible case, $n = s = 0$, the solution of Legendre's equation is

$$\Theta = A + B\log\frac{1+\mu}{1-\mu} = A + B\log\frac{r+z}{r-z}.$$

The term $A$ represents that in $p_0$, which is a constant. The second has branch points at $\mu = \pm 1$, and
is a multiple of $q_0(\mu)$. We have virtually had another harmonic of degree 0 already, since $\phi = \lambda$ is an
obvious solution of Laplace's equation in spherical polar coordinates, excluded for a complete sphere
by the condition that the solution must be a single-valued function of $x$, $y$, $z$. Another is $\log\varpi$; and
since any functions of $z+ix$ and $z+iy$ are other solutions of Laplace's equation, further harmonics of
degree 0 are the real and imaginary parts of their logarithms, namely

$$\log(z^2+x^2)^{1/2} = \log r + \tfrac12\log(1-\sin^2\theta\sin^2\lambda), \qquad \log(z^2+y^2)^{1/2} = \log r + \tfrac12\log(1-\sin^2\theta\cos^2\lambda), \tag{1}$$

$$\tan^{-1}(\tan\theta\cos\lambda), \qquad \tan^{-1}(\tan\theta\sin\lambda). \tag{2}$$

In fact if we assume a function of $\theta$ and $\lambda$ only as a solution of Laplace's equation we get

$$\sin\theta\,\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\phi}{\partial\theta}\right) + \frac{\partial^2\phi}{\partial\lambda^2} = 0,$$

which is satisfied by the real and imaginary parts of any function of $\log\tan\frac12\theta + i\lambda$, i.e. of $\tan\frac12\theta\,e^{i\lambda}$.
This does not include solutions containing $\log r$. The multiplicity of solutions of Laplace's equation
in spherical polar coordinates is therefore endless. Apart from the constant solution, however, the
new solutions all either become infinite like $\log r$ at the centre, are infinite for some $\theta$, $\lambda$, or do not
return to their original values when $\theta$ or $\lambda$ is increased continuously by $2\pi$. By differentiation
with regard to $x$, $y$, $z$ and multiplying by appropriate powers of $r$ we can build up harmonics of any
other degree, just as we built up the functions of the first kind from derivatives of $1/r$. But none of
these solutions satisfy our fundamental rule, that they must be expansible in a sphere in a triple
Taylor series in $x$, $y$, $z$, and they are therefore excluded by physical principles at the outset from any
solution intended to hold within a complete sphere or outside one. If we make the restriction that the
solutions are to be finite and periodic in $\lambda$ they become limited to the solutions of Legendre's associated
equation. This condition arises in the problems of spheroids. For the prolate spheroid, the condition
that $\phi$ over a surface of constant $\xi$ is finite and continuous ensures that it is expansible in a series in
$p_n^s(\cos\eta)(\cos s\lambda, \sin s\lambda)$, the coefficients depending on $\xi$. But then the condition that the terms must
satisfy Laplace's equation ensures that the factors in $\xi$ must be $p_n^s(\cosh\xi)$ and $q_n^s(\cosh\xi)$. The
modifications for an oblate spheroid are obvious. The physical conditions therefore show that the solutions
that we have obtained are those actually required. In practice $\cosh\xi$ will be real and greater than 1
for a prolate spheroid on the surface and at all external points, so that the singularity of $q_n^s$ is excluded
from the region considered and the $q_n^s$ solution becomes admissible. For an oblate spheroid the
functions that occur are $p_n^s(i\sinh\xi)$, $q_n^s(i\sinh\xi)$, with $\xi$ real; and $i\sinh\xi$ cannot be $\pm 1$, so that the $q_n^s$ is
again admissible in external problems. (We have seen that it is inadmissible in internal problems for
a different reason.)
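
The statement in this section that a function of $\theta$ and $\lambda$ alone is harmonic whenever it depends only on $\log\tan\frac12\theta + i\lambda$ can be checked by a change of variable. The following is only a sketch; the auxiliary symbol $u$ is introduced here for illustration and does not appear in the text.

```latex
% Assumed auxiliary variable (not in the text): u = \log\tan\tfrac12\theta.
\[
\frac{du}{d\theta} = \frac{1}{\sin\theta},
\qquad\text{so that}\qquad
\sin\theta\,\frac{\partial}{\partial\theta} = \frac{\partial}{\partial u},
\]
\[
\sin\theta\,\frac{\partial}{\partial\theta}
\Bigl(\sin\theta\,\frac{\partial\phi}{\partial\theta}\Bigr)
+ \frac{\partial^{2}\phi}{\partial\lambda^{2}}
= \frac{\partial^{2}\phi}{\partial u^{2}}
+ \frac{\partial^{2}\phi}{\partial\lambda^{2}} = 0 .
\]
```

This is Laplace's equation in two dimensions in the variables $(u, \lambda)$, so the real and imaginary parts of any analytic function of $u + i\lambda$ satisfy it; the simplest choice gives $\log\tan\frac12\theta$ and $\lambda$ themselves.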

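* Other asymptotic approximations are given by Watson, Mess. Math. 47, 1917, 161-60; Camb.
Phil. Soc. Trans. 22, 1918, 277-308.

24·21. Expansions of $e^{-i\kappa R}/R$. In many applications of the wave equation we need
an expansion of this function, analogous to that of $1/R$ in potential theory, where
$R^2 = (x_i - x_i')^2$ and the expansion is to be in a series of terms in $p_n(\cos\vartheta)$ multiplied by
the product of two functions, one of $r$ and one of $r'$. The expansion can be determined as
follows. The function satisfies

$$\nabla^2\psi = -\kappa^2\psi, \tag{1}$$

and can therefore, if $r < r'$, be expanded in a series

$$\frac{e^{-i\kappa R}}{R} = \sum_{n=0}^{\infty} A_n(r')\,\frac{J_{n+1/2}(\kappa r)}{r^{1/2}}\,p_n(\cos\vartheta). \tag{2}$$

We write

$$\cos\vartheta = \mu. \tag{3}$$

Then with $r$ and $r'$ kept constant

$$R^2 = r^2 + r'^2 - 2rr'\mu, \tag{4}$$

$$R\,dR = -rr'\,d\mu, \tag{5}$$

$$\frac{\partial}{\kappa R\,\partial(\kappa R)} = -\frac{\partial}{\kappa^2 rr'\,\partial\mu}. \tag{6}$$

Hence

$$\left(-\frac{\partial}{\kappa R\,\partial(\kappa R)}\right)^{m}\frac{e^{-i\kappa R}}{R} = \sum_{n=0}^{\infty} A_n(r')\,\frac{J_{n+1/2}(\kappa r)}{r^{1/2}(\kappa^2 rr')^{m}}\,\frac{d^m p_n(\mu)}{d\mu^m}. \tag{7}$$

Now if $m > n$, $d^m p_n(\mu)/d\mu^m = 0$. If $m < n$,

$$\lim_{r\to 0}\frac{J_{n+1/2}(\kappa r)}{r^{m+1/2}} = 0. \tag{8}$$

So if we make $r$ tend to zero the sum on the right of (7) reduces to the limit of the term
with $n = m$, namely

$$\lim_{r\to 0} A_m(r')\,\frac{J_{m+1/2}(\kappa r)}{r^{1/2}(\kappa^2 rr')^{m}}\,\frac{(2m)!}{2^m m!} = \frac{A_m(r')\,\kappa^{1/2}}{(\kappa r')^{m}\,2^{m+1/2}\,(m+\frac12)!}\,\frac{(2m)!}{2^m m!}. \tag{9}$$

But

$$(2m)! = 2^{2m}\,m!\,(m-\tfrac12)!/\sqrt{\pi},$$

and hence (9) reduces to

$$\frac{A_m(r')\,\kappa^{1/2}}{(\kappa r')^{m}\,(m+\frac12)\sqrt{(2\pi)}}.$$

Putting $r = 0$ on the left of (7) we have therefore

$$\left(-\frac{\partial}{\kappa r'\,\partial(\kappa r')}\right)^{m}\frac{e^{-i\kappa r'}}{r'} = \frac{A_m(r')\,\kappa^{1/2}}{(\kappa r')^{m}\,(m+\frac12)\sqrt{(2\pi)}}. \tag{10}$$

But

$$\mathrm{Hi}_{m+1/2}(x) = \left(\frac{2}{\pi}\right)^{1/2} x^{m+1/2}\left(-\frac{d}{x\,dx}\right)^{m}\frac{i e^{-ix}}{x},$$

and therefore

$$A_m(r') = -(m+\tfrac12)\,\pi i\,\frac{\mathrm{Hi}_{m+1/2}(\kappa r')}{\sqrt{r'}}. \tag{11}$$

Hence for $r < r'$

$$\frac{e^{-i\kappa R}}{R} = -\pi i\sum_{n=0}^{\infty}(n+\tfrac12)\,\frac{\mathrm{Hi}_{n+1/2}(\kappa r')\,J_{n+1/2}(\kappa r)}{\sqrt{(rr')}}\,p_n(\cos\vartheta). \tag{12}$$

For $\kappa = 0$ this reduces to 24·05 (4). For $r > r'$ we must interchange $r$ and $r'$.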
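
The expansion of $e^{-i\kappa R}/R$ obtained in 24·21 can be checked numerically. The sketch below is not from the text: it assumes that Hi denotes the Hankel function of the second kind, rewrites the series in terms of spherical Bessel functions $j_n(x) = \sqrt{\pi/(2x)}\,J_{n+1/2}(x)$ (so that each term becomes $-i\kappa(2n+1)\,j_n(\kappa r)\,h_n^{(2)}(\kappa r')\,p_n$), and uses hypothetical helper names and illustrative values of $\kappa$, $r$, $r'$, $\mu$.

```python
import cmath
import math


def legendre_p(n, mu):
    """Legendre polynomial p_n(mu) by the three-term recurrence."""
    p_prev, p = 1.0, mu
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * mu * p - k * p_prev) / (k + 1)
    return p


def spherical_j(n, x, terms=60):
    """j_n(x) = sqrt(pi/(2x)) J_{n+1/2}(x), summed from its power series."""
    term = 1.0
    for m in range(1, n + 1):
        term *= x / (2 * m + 1)              # builds x^n / (2n+1)!!
    total = term
    for k in range(1, terms):
        term *= -(x * x / 2.0) / (k * (2 * n + 2 * k + 1))
        total += term
    return total


def spherical_y_list(nmax, x):
    """y_0 .. y_nmax by upward recurrence (stable for the second solution)."""
    ys = [-math.cos(x) / x, -math.cos(x) / x ** 2 - math.sin(x) / x]
    for k in range(1, nmax):
        ys.append((2 * k + 1) / x * ys[k] - ys[k - 1])
    return ys[: nmax + 1]


def expansion(kappa, r, r_prime, mu, nmax=40):
    """Partial sum of the series for exp(-i kappa R)/R, valid for r < r_prime."""
    x, xp = kappa * r, kappa * r_prime
    ys = spherical_y_list(nmax, xp)
    total = 0j
    for n in range(nmax + 1):
        # Assumption: Hi corresponds to the second Hankel function, j_n - i y_n.
        h2 = complex(spherical_j(n, xp), -ys[n])
        total += (2 * n + 1) * spherical_j(n, x) * h2 * legendre_p(n, mu)
    return -1j * kappa * total


kappa, r, r_prime, mu = 1.3, 0.5, 2.0, 0.3     # illustrative values only
R = math.sqrt(r * r + r_prime * r_prime - 2.0 * r * r_prime * mu)
direct = cmath.exp(-1j * kappa * R) / R
series = expansion(kappa, r, r_prime, mu)
print(abs(series - direct))
```

The power series is used for $j_n$ because upward recurrence loses accuracy when $n$ exceeds the argument, whereas for $y_n$ upward recurrence is the stable direction.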
24·22. The classification of multipole radiation. A familiar solution of the
equations of propagation of electromagnetic waves corresponds to radiation from an
oscillating dipole. Other solutions are, however, important physically, and we consider
now how they may be classified.

We have seen that the $2n+1$ quantities $K^s_{-n-1}$ ($s = -n, -n+1, \ldots, n$) transform like
the set of quantities $x_1^{n-s}x_2^{n+s}$ when $x_1$, $x_2$ undergo a unitary transformation of determinant
unity. We shall speak of such a set of quantities as transforming according to $R_n$ and as
giving a representation of a real rotation of order $n$. The three quantities $ib - a = x_1^2$,
$c = x_1x_2$, $ib + a = x_2^2$, where $a$, $b$, $c$ are the components of a null vector in three dimensions,
transform according to $R_1$.

We denote the electromagnetic vector potential by $\mathbf{A}(A_1, A_2, A_3)$ and the scalar potential
by $\phi$. Then we have in free space


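$$\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = 0. \tag{1}$$

It will be convenient to take as components of $\mathbf{A}$, not $A_1$, $A_2$, $A_3$ but $(iA_2 - A_1, A_3, iA_2 + A_1)$,
which transform according to $R_1$. We call these $A_\mu$ ($\mu = -1, 0, 1$). They satisfy

$$\nabla^2 A_\mu - \frac{1}{c^2}\frac{\partial^2 A_\mu}{\partial t^2} = 0.$$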


These equations have solutions of the form 

u s n' = e iKCt f n .{Kr)p S n- e^' A = ^—^e M f n {K r ) r n '+ 1 Kt. n ._ li 


( 2 ) 


(3) 


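from 24·04 (22); where $(\kappa r)^{1/2}f_{n'}(\kappa r)$ is a Bessel function of $\kappa r$ of order $n' + \frac12$. We consider
the case where the Bessel function is the function Hi; then

$$\mathrm{Hi}_{n'+1/2}(x) = \left(\frac{2}{\pi}\right)^{1/2}(-1)^{n'}x^{n'+1/2}\left(\frac{d}{x\,dx}\right)^{n'}\frac{i e^{-ix}}{x}. \tag{4}$$
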

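Let $c_n^s$ be a set of quantities that transform according to $R_n$. Then it may be shown* that
if and only if $n'$ has one of the values $n$, $n \pm 1$, three linear combinations of $c_n^s u_{n'}^{s'}$ can be
formed which transform according to $R_1$. We write these as

$$W^S = \sum_{s+s'=S}\left(\begin{array}{cc|c} n & n' & 1 \\ s & s' & S \end{array}\right) c_n^s u_{n'}^{s'} \qquad (S = -1, 0, 1), \tag{5}$$

where the coefficients

$$\left(\begin{array}{cc|c} n & n' & 1 \\ s & s' & S \end{array}\right)$$

can be determined.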


The three components $A_\mu$ transform like $W^S$. We have three cases to consider. To
distinguish them we write in turn $h$, $a$ and $b$ for $c$.


(1) n' = n. 

(2) n’ — n+ 1. 

(3) n' = n— 1. 


?cr-i I 


( 6 ) 

(7) 

( 8 ) 


* B. L. v. d. Waerden, Die Gruppentheoretische Methode in der Quantenmechanik, § 18.

The $h_n^s$ solutions differ fundamentally from the others with respect to reflexion of the axes
in the origin. This reverses all coordinates and is a change from a right-handed to a
left-handed set of axes. Some vectors, for instance displacements, have all their components
reversed under such a transformation, and are called polar vectors. Others, including
all vector products of polar vectors, are unchanged and are called axial vectors. We
may speak of any set of $c_n^s$ as an $n$-vector, and again we have two types. Those which are
multiplied by $(-1)^n$ on reflexion we shall call polar $n$-vectors, and those multiplied by
$(-1)^{n+1}$ we shall call axial $n$-vectors. Now $A_\mu$ and $u_{n'}^{S-s}$ are polar, and hence $h_n^s$ is an
axial $n$-vector and $a_n^s$ and $b_n^s$ are polar $n$-vectors.

It is convenient to define (6) as a vector potential for the electromagnetic field of
a magnetic $2^n$-pole and either (7) or (8) as a vector potential for the electromagnetic field
of an electric $2^n$-pole.

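The scalar potential $\phi$ also satisfies

$$\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0; \qquad \operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0. \tag{9}$$

$\phi$ is a true scalar, that is, it does not change sign on reflexion in the origin. It is of the form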



where the $c_n^s$ are components of a polar $n$-vector. It follows that the scalar potential
corresponding to (6) must be zero, since $h_n^s$ is an axial $n$-vector, and that the $c_n^s$ for cases
(7), (8) are related to $a_n^s$, $b_n^s$ respectively.

From the relations

$$\mathbf{H} = \operatorname{curl}\mathbf{A}, \qquad \mathbf{E} = -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}$$

the properties of $\mathbf{E}$ and $\mathbf{H}$ corresponding to the three cases can be deduced. They are
summarized in the following tables, in which it is understood that linear combinations are
to be formed of the quantities given:


$2^n$ electric pole


$2^n$ magnetic pole



It will be noticed that whichever form is taken for the potentials for the $2^n$ electric pole
the form of $\mathbf{E}$ and $\mathbf{H}$ is the same, as it should be. In comparing the fields for magnetic and
electric poles of the same order we see that the roles of $\mathbf{E}$ and $\mathbf{H}$ are interchanged.


We give some simple cases. 

Electric dipole, n = 1, n' = 0: 

A, = K< = «*“*(—)*— = 

* \ktt) r r 

say. In the special case $d^0 = 1$, $d^{\pm 1} = 0$,

$$\phi = -\frac{\partial}{\partial z}\,\frac{e^{i\kappa(ct-r)}}{r}.$$

For $\kappa = 0$ this reduces to the scalar potential for a dipole of unit strength with its axis
along the $z$ axis.

Electric quadripole, $n = 2$, $n' = 1$:

Suppose $b_2^s = 0$ unless $s = 0$. Then we obtain a solution


Magnetic dipole, $n = 1$, $n' = 1$:

Suppose $h_1^0 \neq 0$, $h_1^{\pm 1} = 0$. Then we obtain a solution


■4* 



e Mct-r). ^ 



e Md~r). A s = <f> = 0. 


For further discussion the reader is referred to H. C. Brinkman, Diss. Utrecht, 1932,1-59. W. Heitler, 
Proc. Camb. Phil. Soc. 32, 1936, 112-26. H. A. Kramers, Physica, 10, 1943, 261-72. V. Berestetzky, 
J. Phys. U.S.S.R. 11, 1947, 85-90. B. Jeffreys, Proc. Camb. Phil. Soc. 48, 1952, 470-81. 


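24·23. Multipole expansion of scalar and vector potentials. In the theory of
electromagnetic radiation we sometimes need an expansion that takes together all terms
containing the same power of $\kappa r'$. This can be obtained from 24·21 (12), but also directly
in the following manner. Suppose that we have a finite distribution of charge density
$\rho(x_i')$ and current density $j_i(x_i')$, both containing a time factor $\exp(i\kappa ct)$, and satisfying the
equation of continuity

$$\frac{\partial j_i}{\partial x_i'} + \frac{1}{c}\frac{\partial\rho}{\partial t} = 0, \tag{1}$$

that is,

$$\frac{\partial j_i}{\partial x_i'} + i\kappa\rho = 0. \tag{2}$$

Then the scalar and vector potentials at $P(x_i)$ are given by

$$\phi = \iiint \rho\,\frac{e^{-i\kappa R}}{R}\,dx_1'\,dx_2'\,dx_3', \qquad A_i = \iiint j_i\,\frac{e^{-i\kappa R}}{R}\,dx_1'\,dx_2'\,dx_3', \tag{3}$$

$\rho$ and $j_i$ being regarded as functions of the position of $Q(x_i')$. These satisfy the equation

$$\frac{\partial A_i}{\partial x_i} + \frac{1}{c}\frac{\partial\phi}{\partial t} = 0; \tag{4}$$

for if $f$ is any function of $x_i$, $x_i'$, writing $dx_1'\,dx_2'\,dx_3' = d\tau$, we have

$$i\kappa\iiint\rho f\,d\tau = -\iiint\frac{\partial j_i}{\partial x_i'}\,f\,d\tau = -\iiint\frac{\partial}{\partial x_i'}(j_i f)\,d\tau + \iiint j_i\,\frac{\partial f}{\partial x_i'}\,d\tau. \tag{5}$$

We can find a bounding surface such that $j_i = 0$ at every point of it; hence the first integral
vanishes by Green's lemma. Now if $f$ is a function of $x_i - x_i'$ only,

$$\frac{\partial f}{\partial x_i'} = -\frac{\partial f}{\partial x_i}, \tag{6}$$

and if in particular

$$f(x_i, x_i') = \frac{e^{-i\kappa R}}{R}, \tag{7}$$

$$i\kappa\iiint\rho\,\frac{e^{-i\kappa R}}{R}\,d\tau + \iiint j_i\,\frac{\partial}{\partial x_i}\left(\frac{e^{-i\kappa R}}{R}\right)d\tau = 0, \tag{8}$$

that is,

$$\frac{1}{c}\frac{\partial\phi}{\partial t} + \frac{\partial A_i}{\partial x_i} = 0. \tag{9}$$

In (5) put successively $f = 1$, $x_k'$, $x_k'x_m'$. Then if $\kappa \neq 0$,

$$\iiint\rho\,d\tau = 0, \tag{10}$$

$$i\kappa\iiint\rho\,x_k'\,d\tau = \iiint j_i\,\delta_{ik}\,d\tau = \iiint j_k\,d\tau, \tag{11}$$

$$i\kappa\iiint\rho\,x_k'x_m'\,d\tau = \iiint j_i\,(x_m'\delta_{ik} + x_k'\delta_{im})\,d\tau = \iiint(j_k x_m' + j_m x_k')\,d\tau. \tag{12}$$

Now the Taylor expansion of $\phi$ for $r' < r$ may be written

$$\phi = \sum_{n=0}^{\infty}\frac{(-1)^n}{n!}\iiint\rho\,x_i'x_k'\ldots(\text{to } n \text{ factors})\,d\tau\;\frac{\partial^n}{\partial x_i\,\partial x_k\ldots}(\text{to } n \text{ operations})\,\frac{e^{-i\kappa r}}{r}. \tag{13}$$

We put:

$$P_i = \iiint\rho\,x_i'\,d\tau, \qquad P_{ik} = \iiint\rho\,(x_i'x_k' - \tfrac13\delta_{ik}x_j'x_j')\,d\tau; \qquad P = \iiint\rho\,x_j'x_j'\,d\tau. \tag{14}$$
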

Then $P_i$ is a vector which can be expressed as a linear combination of three $b_1^s$. Also
$P_{ii} = 0$ and the tensor $P_{ik}$ has five independent components, which can be expressed as
linear combinations of five $b_2^s$. $P$ is a scalar. Then

* = e-[^( i + i) + i ^(-^ + ^(i + i)(^-^)-^ + ...]. (15) 
The term in $P_i$ gives the scalar potential for an electric dipole, that in $P_{ik}$ for an electric
quadripole and that in $P$ for an electric single pole. The corresponding expansion for $A_i$ is


A < y> 


p—iKr 

= iK—P i 
r 


: -\ JJJtoi +kA) 71 * 

T-) dT ’ 


by (11) and (14). But by (12) the second term is 


these two terms give vector potentials for an electric quadripole and electric single pole 
respectively, corresponding to those in the scalar potential. The field (E, H) of the electric 
single pole vanishes. (In the conditions stated after (5) there cannot be an oscillating 
electric single pole.) 

In the last term of (17) we write 

= {x\j k - x’JJ dr. (19) 

This is an antisymmetrical tensor and can be related to an axial $h_1^s$. The term containing
it is a vector potential for a magnetic dipole. We have then as far as terms in $(\kappa r')^2$

A ( = ^{ip- i)—.— i)). (20) 


From 24·21 (12) we can write $A_i$ as

fffiii: s q»,. Hi "^ ( - } (cos 0')&(cos6) (21) 

JJJ n=0 8= —n V r \ V 

where $C_n^s$ are constants. It can be shown that the terms

( 22 )

can be expressed as linear combinations of quantities $c_{n\pm1}^{s'}$, $c_n^{s'}$. Sets of linear combinations
of products of these with the $p_n^s(\cos\theta)\,e^{is\lambda}$ that give a polar 1-vector can be formed in the
following ways.

$c_{n+1}^{s'}\,p_n^s(\cos\theta)\,e^{is\lambda}$     $c_{n+1}^{s'}$ polar     $2^{n+1}$ electric pole
$c_{n-1}^{s'}\,p_n^s(\cos\theta)\,e^{is\lambda}$     $c_{n-1}^{s'}$ polar     $2^{n-1}$ electric pole
$c_n^{s'}\,p_n^s(\cos\theta)\,e^{is\lambda}$       $c_n^{s'}$ axial       $2^n$ magnetic pole.

The terms in $\phi$ and $A_i$ containing

$$\frac{\partial^n}{\partial x_i\,\partial x_k\ldots}\,\frac{e^{-i\kappa r}}{r}$$

give rise in general to mixed multipole radiation, as we have seen from the case of

$$\frac{\partial^2}{\partial x_i\,\partial x_k}\left(\frac{e^{-i\kappa r}}{r}\right).$$


EXAMPLES 


1. An electric dipole of moment $M$ is at a distance $a$ from the centre of a sphere of dielectric constant
$K$ and radius $b$ ($b < a$) and the centre of the sphere is on the axis of the dipole. Prove that the force of
attraction between the sphere and the dipole is


R -\ • n(n+l)»(n + 2) (b\* n+1 

a * o Xn + n+1 W 


(M.T. 1939.) 


2. Obtain the gravitational potential of a thin spherical layer of matter, the surface density of
which has axial symmetry.

A uniform nearly spherical solid of density $\rho$ has the surface $r = a(1 + \epsilon P_2)$ as its boundary. It is
surrounded by liquid of volume $\frac43\pi(b^3 - a^3)$ and uniform density $\sigma$. Show that, provided the solid is
completely covered with liquid, the equation of the free surface is $r = b(1 + \eta P_2)$, where

$$\eta = \frac{3(\rho-\sigma)\,a^5\epsilon}{b^2\{5(\rho-\sigma)a^3 + 2\sigma b^3\}}. \qquad \text{(M.T. 1936.)}$$

3. A nearly spherical conductor is bounded by the surface

$$r = a\{1 + \epsilon P_n(\cos\theta)\} \qquad (n > 1),$$

where $\epsilon$ is small. It is insulated and placed in a uniform field $F$ with its axis in the direction of the
field. Find the disturbance in the field due to the presence of the conductor, and show that the surface
density of the induced charge at a point of the conductor is

$$\frac{3F}{4\pi}\cos\theta + \frac{3F}{4\pi}\,\frac{n}{2n+1}\,\epsilon\,\{(n-2)P_{n-1}(\cos\theta) + (n+1)P_{n+1}(\cos\theta)\}. \qquad \text{(M.T. 1939.)}$$


4. A small magnet is placed at the centre of a spherical shell of iron of radii $a$ and $b$ and permeability
$\mu$. Show that the field outside the shell is reduced by the presence of the iron in the ratio




5. A condenser is formed from two conducting spheres of radii $a$, $b$ ($a < b$) with centres at $A$, $B$,
where $AB$ is of length $c$ and $(c/a)^2$ may be neglected. Find the capacity of the condenser when the
outer sphere is earthed, and show that if $Q$ is the charge on the inner sphere, the surface density at
a point $P$ of that sphere is

$$\frac{Q}{4\pi a^2}\left\{1 - \frac{3a^2c\cos PAB}{b^3 - a^3}\right\}. \qquad \text{(Prelim. 1936.)}$$


6. An electric charge $e$ is distributed uniformly along the segment $BC$ of a straight line $OBC$.
$P$ is any point such that $OP < OB$. Writing

$$OP = r, \quad OB = b, \quad OC = c, \quad \cos BOP = \mu,$$

obtain an expression in terms of $r$, $P_0$, $P_1$, ... for the potential at $P$ due to the charge.

An earthed conducting sphere of radius $a$ ($< b$) is placed with its centre at $O$. Obtain an expression
for the potential at a point $P$ in the region $a < r < b$, and show that the charge induced on the sphere is

$$-ea\,\frac{\log c - \log b}{c - b}.$$


7. If $K_n$ is a solid harmonic of degree $n$ show that

$$\nabla^2(r^mK_n) = m(m+2n+1)\,r^{m-2}K_n.$$

If $U$ is a homogeneous polynomial in $x$, $y$, $z$ of degree $2n$, show that

$$\iint U\,dS = \frac{4\pi a^{2n+2}}{(2n+1)!}\,(\nabla^2)^nU,$$

integration being over the sphere $r = a$.

8. Prove from the integral definition that

$$(n+1)\,q_{n+1}(\mu) - (2n+1)\,\mu\,q_n(\mu) + n\,q_{n-1}(\mu) = 0.$$

Evaluate $q_0(\mu)$, $q_1(\mu)$ by integration; and hence prove 24·17 (9) and 24·17 (18).

9. Prove directly that $\int_{-1}^{1}\dfrac{p_n(v)}{\mu - v}\,dv$ is a solution of Legendre's equation and hence that it is a
multiple of $q_n(\mu)$.

10. Prove that if $\mu$ is not real and between $-1$ and $1$,

$$q_n(\mu) = \frac{1}{2^n\pi}\int_{-1}^{1}\frac{(1-v^2)^n}{(\mu-v)^{n+1}}\,dv.$$

11. Prove that if $\mu$ is not real and between $-1$ and $1$, and $v$ is in this interval,

$$\frac{1}{\mu - v} = \tfrac12\pi\sum_{n=0}^{\infty}(2n+1)\,p_n(v)\,q_n(\mu). \qquad \text{(Heine.)}$$


12. Express $\sin^3\theta\,(1 + \cos\theta)\sin 3\lambda$ as a sum of tesseral harmonics.

13. Toroidal coordinates $\sigma$, $\psi$, $\lambda$ are related to cylindrical coordinates $\varpi$, $z$, $\lambda$ by the equation

$$z + i\varpi = a\cot\tfrac12(\psi + i\sigma).$$

Prove that

$$ds^2 = \frac{a^2(d\sigma^2 + d\psi^2 + \sinh^2\sigma\,d\lambda^2)}{(\cosh\sigma - \cos\psi)^2}$$

and hence express Laplace's equation in terms of $\sigma$, $\psi$, $\lambda$.

Show that there exist simple toroidal harmonics of the type

$$(\cosh\sigma - \cos\psi)^{1/2}\cos n\psi\,\cos m\lambda\,f(\cosh\sigma),$$

and determine $f$.

e-*‘ = (^)^£(2n +1)( -1)" Pn(M)- 


(I.C. 1937.) 


14. Prove that 






Chapter 25 

ELLIPTIC FUNCTIONS 

'Double, double, toil and trouble.'

SHAKESPEARE, Macbeth

25·01. Definition: illustrations. The elliptic functions are characterized by the
following properties. (1) They are single-valued analytic functions over the whole plane,
except at isolated points where they have poles. (2) Two numbers $\omega$ and $\omega'$ exist, whose
ratio is not real, such that for all values of $z$

$$f(z+\omega) = f(z+\omega') = f(z)$$

and therefore

$$f(z + m\omega + n\omega') = f(z),$$

for all positive and negative integral values of $m$ and $n$.

The name 'elliptic function' arose first from the relation of the functions to the integral
that arises in the determination of the perimeter of an ellipse. They present themselves
in physics in numerous ways. They have an extensive purely mathematical theory,
possibly more extensive than any other family of transcendental functions, and
consisting mainly of three-volume works.* Unlike most functions treated in this book they
satisfy no linear differential equation; they satisfy non-linear differential equations of
the first order. While most of the functions of mathematical physics, in one form or
another, arise from the equations of waves and heat conduction, elliptic functions turn
up in all sorts of places, usually unexpectedly.

They occur naturally in potential problems concerning the interior of a rectangle, the
potential being kept zero over the boundary, while there may be charges inside. The
boundary conditions are satisfied if a suitable doubly-infinite series of images is built
up by successive reflexions in the sides, and the total complex potential is easily seen to
have the essential properties of an elliptic function. An important series of problems
relating to lines of equally spaced vortices between rigid barriers has been solved in this
way by Rosenhead,† and has applications to the resistance of solids in wind tunnels.

25·02. Periodicity of solutions of a type of differential equation: pendulum
and central orbits. A common mode of occurrence of periodicities is through a
differential equation of the form

$$\dot{x}^2 = f(x), \tag{1}$$

which is to be regarded as a first integral of a second-order equation $\ddot{x} = \frac12 f'(x)$, where $\dot{x}$
is real and not zero at $x = x_0$, and $f(x)$ has simple zeros at $x = a$ and $x = b$, where $a < x_0 < b$.
The solution is

$$t - t_0 = \int_{x_0}^{x}\frac{dx}{\sqrt{\{f(x)\}}}. \tag{2}$$

* Standard works are those of Tannery and Molk, Enneper, and Halphen. A recent one (in one
volume) by E. H. Neville achieves greater symmetry in the treatment.

† Phil. Trans. A, 228, 1929, 276-329.

If $\dot{x}$ is positive at $t = t_0$, it remains positive till $x$ reaches $b$. A formally possible solution
henceforward would be that $x$ remains equal to $b$; for then we should have $\dot{x} = 0$,
$f(x) = f(b) = 0$.* But this solution is excluded by the condition that $\ddot{x} = \frac12 f'(b)$, which is
not zero because by hypothesis $b$ is a simple zero of $f(x)$. (An equivalent condition, if
complex values are admitted, is that $x$ is an analytic function of $t$.) Hence $\dot{x}$ must reverse
its sign and $x$ decreases steadily to $a$, where $\dot{x}$ again reverses. It follows that in the
conditions stated $x$ is a periodic function of $t$, with a real period

$$\omega = 2\int_a^b\frac{dx}{\sqrt{\{f(x)\}}}. \tag{3}$$

In one of its simplest forms this type of solution occurs in the motion of a simple
pendulum given that at $t = 0$, $\theta = \theta_0$, $\dot\theta = \dot\theta_0$. The energy equation is

$$\dot\theta^2 = \dot\theta_0^2 + \frac{2g}{l}(\cos\theta - \cos\theta_0) = \dot\theta_0^2 + \frac{4g}{l}(\sin^2\tfrac12\theta_0 - \sin^2\tfrac12\theta).$$

This is positive at $\theta = \theta_0$ and at $\theta = 0$, and has no zero for $0 < \theta < \theta_0$. It is negative at
$\theta = \pi$ provided that

$$\dot\theta_0^2 < \frac{2g}{l}(1 + \cos\theta_0),$$

and then has a simple zero between $\theta = \theta_0$ and $\theta = \pi$. Similarly it has one between $-\theta_0$
and $-\pi$. Hence $\theta$ is a function of $t$ with a real period.
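
The period formula (3) can be evaluated directly. The following sketch is an illustration rather than anything given in the text: it takes a pendulum released from rest at an assumed amplitude `alpha`, so that $f(\theta)$ has simple zeros at $\pm\alpha$, removes the end-point singularities by the substitution $x = c + h\sin u$, and compares the result with $4K/\sqrt{(g/l)}$, where $K$ is the complete elliptic integral of modulus $\sin\frac12\alpha$ computed by the arithmetic-geometric mean. The function names and numerical values are hypothetical.

```python
import math


def period_by_quadrature(f, a, b, n=2000):
    """Approximate 2 * integral_a^b dx / sqrt(f(x)), f having simple zeros at a, b.

    The substitution x = c + h*sin(u) removes the inverse-square-root
    singularities at the end points; the midpoint rule never samples them.
    """
    c, h = 0.5 * (a + b), 0.5 * (b - a)
    du = math.pi / n
    total = 0.0
    for j in range(n):
        u = -0.5 * math.pi + (j + 0.5) * du
        x = c + h * math.sin(u)
        total += h * math.cos(u) / math.sqrt(f(x))
    return 2.0 * total * du


def complete_K(k):
    """First complete elliptic integral via the arithmetic-geometric mean."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    while abs(a - b) > 1e-15 * a:
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


g_over_l = 9.81   # assumed value of g/l (illustrative only)
alpha = 2.0       # assumed amplitude in radians; the bob is at rest there


def f(theta):
    # theta-dot squared from the energy equation, with zero speed at theta = alpha
    return 4.0 * g_over_l * (math.sin(0.5 * alpha) ** 2 - math.sin(0.5 * theta) ** 2)


period_numeric = period_by_quadrature(f, -alpha, alpha)
period_elliptic = 4.0 * complete_K(math.sin(0.5 * alpha)) / math.sqrt(g_over_l)
print(period_numeric, period_elliptic)
```

The two values should agree closely, and both exceed the small-oscillation period $2\pi/\sqrt{(g/l)}$.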

Again, consider motion in a central orbit under an acceleration $Ar^m$ towards the centre;
the work function per unit mass is

$$U = -\frac{Ar^{m+1}}{m+1} \qquad (A > 0).$$

The energy and angular momentum equations are

$$\dot{r}^2 + r^2\dot\theta^2 = \dot{r}_0^2 + r_0^2\dot\theta_0^2 + 2(U - U_0) = V^2 + 2(U - U_0),$$

$$r^2\dot\theta = r_0^2\dot\theta_0,$$

whence

$$\dot{r}^2 = V^2 - \frac{r_0^4\dot\theta_0^2}{r^2} + 2(U - U_0) = f(r).$$

The signs of $f(r)$ are seen to run as follows ($f(r_0) = \dot{r}_0^2 > 0$)

                      $r \to 0$      $r = r_0$      $r \to \infty$
  $m < -3$            $+\infty$        $+$          $V^2 - 2U_0$
  $-3 < m < -1$       $-\infty$        $+$          $V^2 - 2U_0$
  $-1 < m$            $-\infty$        $+$          $-\infty$

Hence $f(r)$ always has a pair of zeros on opposite sides of $r_0$ if $m > -1$, and if $-3 < m < -1$
provided $V^2 < 2U_0$. They are easily seen to be simple. Hence in motion under a
central acceleration $r$ is periodic unconditionally if $m > -1$ (including as an important
special case the law of the direct distance), and if $-3 < m < -1$ provided $V^2 < 2U_0$. The
latter covers as a special case the inverse square law $m = -2$, and the critical case $V^2 = 2U_0$
is that of parabolic motion.
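
For the inverse square law the two zeros of $f(r)$ can be found in closed form, since $r^2f(r)$ is then a quadratic in $r$. The sketch below is an illustration under assumed initial data (the function name and the numbers are hypothetical); it checks that the zeros lie on opposite sides of $r_0$ when $V^2 < 2U_0$.

```python
import math


def turning_points_inverse_square(A, r0, rdot0, thetadot0):
    """Zeros of f(r) = V^2 - h^2/r^2 + 2(U - U0) with U = A/r (the case m = -2)."""
    h = r0 * r0 * thetadot0                       # r^2 * theta-dot, a constant
    V2 = rdot0 ** 2 + (r0 * thetadot0) ** 2       # V^2
    U0 = A / r0
    # r^2 f(r) = (V^2 - 2 U0) r^2 + 2 A r - h^2
    qa, qb, qc = V2 - 2.0 * U0, 2.0 * A, -h * h
    if qa >= 0.0:
        raise ValueError("V^2 >= 2 U0: r is not periodic")
    disc = math.sqrt(qb * qb - 4.0 * qa * qc)
    roots = sorted([(-qb + disc) / (2.0 * qa), (-qb - disc) / (2.0 * qa)])
    return roots[0], roots[1]


# Assumed initial conditions, for illustration only.
A, r0, rdot0, thetadot0 = 1.0, 1.0, 0.2, 0.9
r_min, r_max = turning_points_inverse_square(A, r0, rdot0, thetadot0)
print(r_min, r_max)
```

With these values $V^2 = 0.85 < 2U_0 = 2$, so the motion in $r$ is bounded between the two roots.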

* Professor D. R. Hartree informs us that this solution is actually often given by mechanical
integrating machines unless special precautions are taken.

25·03. Differential equation satisfied by sn $t$: inversion of the integral. This
is a powerful type of argument because it enables us to demonstrate exact periodicity in
suitable conditions by mere inspection of the differential equation, without the need to
solve it first. Let us apply it to the differential equation

$$\left(\frac{dx}{dt}\right)^2 = (1-x^2)(1-k^2x^2) = X^2, \tag{1}$$

where $k$ is real and less than 1, and $\ddot{x}$ is assumed always to exist. When $x = 0$, we take
$dx/dt = +1$, and the right side has simple zeros at $x = \pm 1$. Then

$$\frac{d^2x}{dt^2} = \frac12\frac{dX^2}{dx}.$$

Hence there is a period

$$\omega = 4\int_0^1\frac{dx}{\sqrt{\{(1-x^2)(1-k^2x^2)\}}} = 4K, \tag{2}$$

say. Also, if $t = 0$ when $x = 0$,

$$t = \int_0^x\frac{dx}{\sqrt{\{(1-x^2)(1-k^2x^2)\}}}. \tag{3}$$

This is known as the first elliptic integral and $K$ as the first complete elliptic integral.
Integrals of these forms were tabulated and much studied by Euler and Legendre, but
the whole theory was revolutionized by a remark of Abel. Clearly if $k = 0$, $t = \sin^{-1}x$
and the study of the integrals as such is analogous to studying the properties of $\sin^{-1}x$
instead of the much more manageable function $\sin x$. The method initiated by Abel and
developed by Jacobi is to write sn$(t; k)$ for $x$ in the upper limit of (3) and to regard the
equation as specifying the upper limit as a function of $t$, which will have period $4K$
analogous to the period $2\pi$ of the circular functions. Then sn$(t; k)$ is to be regarded as
a generalization of $\sin t$. It is usual to suppress explicit statement of $k$ when the same value
is to be understood through the work.

$K$ can be expressed as a power series in $k$ by putting $x = \sin\phi$; then

$$K = \int_0^{\pi/2}\frac{d\phi}{\sqrt{(1 - k^2\sin^2\phi)}} = \int_0^{\pi/2}\left\{1 + \sum_{n=1}^{\infty}\frac{1\cdot3\cdot5\ldots(2n-1)}{2^n\,n!}\,k^{2n}\sin^{2n}\phi\right\}d\phi = \tfrac12\pi\left\{1 + \sum_{n=1}^{\infty}\left(\frac{1\cdot3\cdot5\ldots(2n-1)}{2^n\,n!}\right)^2k^{2n}\right\}. \tag{4}$$

This is a hypergeometric series with radius of convergence 1.
Evidently

$$\operatorname{sn}0 = 0, \qquad \operatorname{sn}K = 1; \tag{5}$$

$$\operatorname{sn}(-t) = -\operatorname{sn}t, \qquad \operatorname{sn}(2K - t) = \operatorname{sn}t; \qquad \operatorname{sn}(2K + t) = -\operatorname{sn}t,$$

$$\frac{d}{dt}\operatorname{sn}(2K - t) = -\frac{d}{dt}\operatorname{sn}t = \frac{d}{dt}\operatorname{sn}(2K + t). \tag{6}$$

So far we have considered only real values of $x$ and $t$. But if sn $t$ possesses an analytic
continuation to complex values of $t$ this can be taken as completing the definition of
sn $t$. Further, since the equalities (6) hold for all real values in certain intervals they will
also hold for all values accessible by continuation.
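
The real-variable relations for sn $t$ in 25·03 can be illustrated numerically. The sketch below is an assumption-laden example, not the authors' method: it sums the power series for $K$, then integrates the second-order equation $\ddot{x} = \frac12\,dX^2/dx = -(1+k^2)x + 2k^2x^3$ with $x(0) = 0$, $\dot{x}(0) = 1$ by a classical Runge-Kutta step. Using the second-order form avoids the spurious solution $x = 1$ that the first-order equation admits at the turning point. The function names and the choice $k = 0.5$ are hypothetical.

```python
import math


def K_series(k, tol=1e-16):
    """Sum (pi/2) * (1 + sum_n c_n^2 k^(2n)), c_n = 1.3.5...(2n-1) / (2^n n!)."""
    c, total, n = 1.0, 1.0, 0
    while True:
        n += 1
        c *= (2 * n - 1) / (2 * n)          # c_n obtained from c_(n-1)
        term = c * c * k ** (2 * n)
        total += term
        if term < tol * total:
            return 0.5 * math.pi * total


def sn_by_integration(k, t_end, steps):
    """Integrate x'' = -(1 + k^2) x + 2 k^2 x^3, x(0) = 0, x'(0) = 1 (RK4)."""
    def acc(x):
        return -(1.0 + k * k) * x + 2.0 * k * k * x ** 3

    h = t_end / steps
    x, v = 0.0, 1.0
    for _ in range(steps):
        k1x, k1v = v, acc(x)
        k2x, k2v = v + 0.5 * h * k1v, acc(x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, acc(x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, acc(x + h * k3x)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x


k = 0.5                                     # illustrative modulus
K = K_series(k)
sn_K = sn_by_integration(k, K, 2000)        # expected close to 1
sn_2K = sn_by_integration(k, 2.0 * K, 4000) # expected close to 0
sn_3K = sn_by_integration(k, 3.0 * K, 6000) # expected close to -1
print(K, sn_K, sn_2K, sn_3K)
```

The series converges quickly for moderate $k$ but slowly as $k$ approaches 1, where another method of computing $K$ would be preferable.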
25·031. We have therefore to consider whether (3) can be inverted when $x$ is complex,
to give $x$ as an analytic function of $t$, and whether this function will be single-valued.
We still take $k$ real and $0 < k < 1$. Other values will be considered later. The differential
equation shows that $t$, considered as a function of $x$, is analytic in a region that does not
enclose a branch-point of $X$, where

$$X^2 = (1-x^2)(1-k^2x^2). \tag{1}$$

Also

$$\frac{d^2x}{dt^2} = \frac12\frac{d}{dx}\left(\frac{dx}{dt}\right)^2 = \frac12\frac{dX^2}{dx}, \tag{2}$$

so that the extension to complex values introduces just the extra condition that the
differential equation is a first integral of a second order equation of the form $\ddot{x} = \frac12 f'(x)$.
The integral for $t$ converges when $x$ tends to infinity, so that we must also consider its
behaviour near the corresponding values of $t$ (which will differ according to the path
chosen), and there will be infinities of $x$ considered as a function of $t$.

If $a \neq \alpha$, where $\alpha$ is any of $\pm 1$, $\pm 1/k$, then $t(x) - t(a)$ will begin with a linear term and
the series for $t$ can be inverted, giving $x$ as a single-valued analytic function of $t$ within
any circle about $t(a)$ such that no point $x = \alpha$ lies within it and such that $x$ does not tend
to infinity. Near any point $x = \alpha$, we have

$$t - t(\alpha) = (x - \alpha)^{1/2}\phi(x), \tag{3}$$

where $\phi(x)$ is analytic and not zero at $x = \alpha$. This is of a form considered in 12·052, and
$x$ is single-valued near $t = t(\alpha)$, while $dx/dt = 0$ there.

For $x$ large on any path going to infinity

$$\frac{dt}{dx} = \mp\frac{1}{kx^2}(1 + \psi), \qquad t = M \pm\frac{1}{kx}(1 + \chi), \tag{4}$$

where $M$ is finite and $\psi$, $\chi$ are analytic and of order $1/x$. Hence

$$x = \pm\frac{1}{k(t - M)} + g(t), \tag{5}$$

where $g(t)$ is analytic at $t = M$. The points $t = M$ are therefore simple poles of $x$, of
residue $\pm 1/k$.

Hence $x$ is a single-valued analytic function of $t$ over the set of values of $t$ taken by the
integral for all modifications of the path, with simple poles as its only singularities.

If $X^2$ was a quintic or some polynomial of higher degree in $x$, with no triple or higher
zeros, $x$ considered as a function of $t$ would still be analytic at values of $t$ that make $X^2$
zero. But for $x$ large the term in $1/x$ in (4) would be replaced by one in $x^{-\gamma}$, with $\gamma > 1$.
Hence every infinity of $x$ would also be a branch-point, and $x$ would not be single-valued.
In other words we could take $x$ along a large arc, not closed, and arrive at the same value
of $t$ as we started from.

If $X^2$ was a cubic we should get a term in $x^{-1/2}$ instead of $1/x$, and $x$ would still be
single-valued as a function of $t$, but with double poles instead of simple poles. The student may
examine for himself what happens if $X^2$ is linear or quadratic. The result is that inversion
of the integral makes $x$ a single-valued function of $t$, provided that $X^2$ is of degree not
higher than 4.
It follows at once that 25·03 (6) can be extended to all accessible $t$, and that sn $t$ has
period $4K$.

Now consider the continuation when $1 < x < 1/k$. To give $X$ a definite sign we take $x$
to pass 1 by a semicircle above it, so that in the new range

$$\sqrt{(1-x^2)} = -i\sqrt{(x^2-1)}. \tag{6}$$

Then

$$t = K + i\int_1^x\frac{dx}{\sqrt{\{(x^2-1)(1-k^2x^2)\}}} = K + iv, \tag{7}$$

say; and when $x \to 1/k$,

$$t \to K + iK', \tag{8}$$

where

$$K' = \int_1^{1/k}\frac{dx}{\sqrt{\{(x^2-1)(1-k^2x^2)\}}}. \tag{9}$$

The quartic in the denominator is real and positive in the range, with simple zeros
at $1$, $1/k$. Then $x$ is a function of $v$ with period $2K'$. Therefore by (7) it is a function
of $t$ with period $2iK'$, and therefore

$$\operatorname{sn}(t + 2iK') = \operatorname{sn}t, \tag{10}$$

for all $t$. Also

$$\operatorname{sn}(2iK' - t) = -\operatorname{sn}t. \tag{11}$$

Now take $x$ purely imaginary and the path along the imaginary axis. Then if

$$x = iy, \tag{12}$$

$$t = i\int_0^y\frac{dy}{\sqrt{\{(1+y^2)(1+k^2y^2)\}}}. \tag{13}$$

Take

$$\frac{1+y^2}{1+k^2y^2} = z^2, \tag{14}$$

and take the upper limit as $x = i\infty$. Then

$$t = i\int_1^{1/k}\frac{dz}{\sqrt{\{(z^2-1)(1-k^2z^2)\}}} = iK'. \tag{15}$$

Hence $iK'$ is a pole of $x$ considered as a function of $t$. Also for large $y$

$$x = iy = \frac{1}{k(t - iK')}, \tag{16}$$

and the residue of sn $t$ at $iK'$ is $+1/k$. Since $2iK'$ is a period there is another pole at $-iK'$
with residue $1/k$. Since sn$(2K + t) = -$sn $t$, there are poles at $2K \pm iK'$ with residue $-1/k$.
$K'$ can be put in another form; in (9) put

$$\frac{1-k^2x^2}{1-k^2} = z^2, \qquad k'^2 = 1 - k^2; \tag{17}$$

then

$$K' = \int_0^1\frac{dz}{\sqrt{\{(1-z^2)(1-k'^2z^2)\}}}, \tag{18}$$

so that $K'$ is the same function of $k'$ as $K$ is of $k$.
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Now if we use only paths starting from x = 0 and such that ${x) ^ 0 for tR(x) between 
±1/Jc,t will tend to iK' when x tends to infinity in any direction. On the real axis it 
increases from 0 to K as x goes from 0 to 1; as x increases further to l/k, the real part of t 
remains constant and the imaginary part increases to K'\ as x->co, the imaginary part 
remains constant and the real part decreases to 0. As x goes from 0 to — 1, — l/k, — co 
(passing — 1 and — 1 jlc on the positive side), t goes to — K, — K + iK ', iK'. Thus t is bounded 
in the half-plane of x, and both its real and imaginary parts must take their greatest and 
least values on the boundary. Hence for all x in the upper half-plane 

-K^SR(t)^K, 0 

If we take paths confined to the lower half-plane, we get similar results except for the
reversal of all imaginary parts. For any $t$ satisfying these inequalities it follows that an $x$
can be found, and is unique since the region contains no branch-point of $t$. But the integral
provides no definition of $t$ for $|\Im(t)| > K'$ unless the path crosses the real axis where
$\Re(x) > 1$ or $\Re(x) < -1$. We can however make the real and imaginary parts of $t$ as large as
we like by including circuits about two branch-points. In the figure the integrals about
$C$ and $C'$, in the directions shown, are $4K$ and $2iK'$; and by including a sufficient number of
circuits of either type before proceeding from 0 to $x$ we can make the real and imaginary
parts of the integral as large as we like, positive or negative, without altering $dx/dt$. By
modification of the contour we can get a simple proof of the equivalence of our expressions
(9), (12) for $K'$. For $L$ can be deformed into $M$ without crossing a branch-point so as to pass
to $-i\infty$ as shown. The loop around part of the real axis contributes $2iK'$ as defined in (9),
the two parts between 0 and 1 cancel, and the path from 0 to $-i\infty$ makes a contribution
equal and opposite to the integral along $L$. Since the integrals along $L$ and $M$ must be
equal the equivalence of (9) and (12) follows.

We can take the path to include a single loop about $x = +1$; this will contribute $2K$, but
$X$ is reversed in sign when we get back to 0. By taking the integral to $-x$ we therefore get

$$\operatorname{sn}(2K + t) = -\operatorname{sn} t, \qquad \frac{d}{dt}\operatorname{sn}(2K + t) = -\frac{d}{dt}\operatorname{sn} t. \tag{19}$$

Taking it to $x$ we get

$$\operatorname{sn}(2K - t) = \operatorname{sn} t. \tag{20}$$

By suitable combinations of paths $C$ and $C'$ in either sense, with or without a loop about
$+1$, we can therefore attach a meaning to $\operatorname{sn} t$ by inversion of the integral for any value of $t$.
Hence $\operatorname{sn} t$ as defined by continuation is a single-valued analytic function over the whole
$t$ plane, with simple poles of residue $1/k$ at all points $4mK + (2n+1)iK'$, and simple poles
of residue $-1/k$ at all points $(4m+2)K + (2n+1)iK'$, where $m$ and $n$ are positive or
negative integers, and is a doubly periodic function with periods $4K$ and $2iK'$.
25·04. Impossibility of three independent periods. If an analytic function $f(z)$
is not constant, and has a set of periods $\omega_1, \omega_2, \dots$ there must be one of them with the
smallest modulus. For if not, take any regular point $z$ of the function; then $f(z) = f(z+\omega_r)$
for all $r$. If there is no smallest $|\omega_r|$, there are infinitely many points $t = \omega_r$ with 0 as a
limit point such that $f(z+t) = f(z)$. Hence $f(z+t) - f(z) = 0$ for all $t$, and therefore $f(z)$
is a constant, contrary to hypothesis. Denote the $\omega_r$ with least modulus by $\omega$. If possible
let $\Omega$ be a period with the same argument as $\omega$ but not an integral multiple of $\omega$. Let
$m$ be an integer such that $m|\omega| < |\Omega| < (m+1)|\omega|$. Then $\Omega - m\omega$ is a period with smaller
modulus than $\omega$, and we have a contradiction. Hence all periods with the same argument
as $\omega$ are integral multiples of $\omega$. Let $\omega'$ be the period with smallest modulus that is not an
integral multiple of $\omega$. Consider the plane marked out into parallelograms whose corners
are the points $m\omega + n\omega'$, $m$ and $n$ being integers. If possible
let $\omega''$ be a third period, not expressible in the form $m\omega + n\omega'$.
Then any expression $\omega'' - m\omega - n\omega'$ is also a period, and we can
choose $m$ and $n$ so as to make it lie within the parallelogram
whose corners are $0$, $\omega$, $\omega'$, $\omega+\omega'$. Denote the result by $\Omega$, and
draw the diagonal $AC$ connecting the two obtuse angles. Then
if $\Omega$ lies within the triangle $OAC$, the length of $O\Omega$ is less than
the larger of $|\omega|$ and $|\omega'|$, and therefore $< |\omega'|$, contrary to our hypothesis that $\omega'$ is the
period of smallest modulus that is not a multiple of $\omega$. Similarly if $\Omega$ lies within $ABC$, the
length $B\Omega$ is less than $|\omega'|$. Hence an analytic function, not a constant, cannot have
more than two independent periods.
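The reduction step of this argument, choosing $m$ and $n$ so that $\omega'' - m\omega - n\omega'$ falls inside the parallelogram with corners $0$, $\omega$, $\omega'$, $\omega+\omega'$, can be sketched numerically. The function name and the sample periods below are hypothetical, chosen only for illustration; the sketch solves for the real coordinates of the candidate period in the basis $(\omega, \omega')$ and removes their integer parts.

```python
import math

def reduce_into_parallelogram(w, w1, w2):
    """Return (m, n, r) with r = w - m*w1 - n*w2 and r = s*w1 + t*w2, 0 <= s, t < 1.

    w1, w2 play the part of the periods omega, omega'; w is the candidate
    third period. All three are Python complex numbers.
    """
    det = w1.real * w2.imag - w1.imag * w2.real
    if det == 0:
        raise ValueError("w1 and w2 must have different arguments")
    # Real coordinates (s, t) of w in the basis (w1, w2), by Cramer's rule.
    s = (w.real * w2.imag - w.imag * w2.real) / det
    t = (w1.real * w.imag - w1.imag * w.real) / det
    m, n = math.floor(s), math.floor(t)
    return m, n, w - m * w1 - n * w2
```

For instance, with the assumed values `w1 = 2`, `w2 = 1 + 3j` and `w = 7.3 + 10.1j`, the reduced point lies inside the parallelogram, and its distance from the nearer of the corners $0$ and $\omega + \omega'$ is less than the larger of $|\omega|$, $|\omega'|$, as the proof requires.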

This result is relevant to the inversion of an integral $\int dx/X$, where $X^2$ is of higher degree than 4.
For a contour about any two simple zeros of $X^2$ should define a period. For $X^2$ of degree $\le 4$ it can be
shown that the integrals about such paths are connected by relations of the form $\epsilon\omega + \epsilon'\omega' + \epsilon''\omega'' = 0$,
where $\epsilon$, $\epsilon'$, $\epsilon''$ are 0 or $\pm 1$, and determine at most two independent periods. For higher degrees there
are more independent periods, which are possible only because the function is no longer single-valued.
Such functions are called Abelian functions.

We shall call the pair of periods $\omega$, $\omega'$ defined as in the last proposition the fundamental
periods. Any parallelogram whose corners are at $z_0$, $z_0+\omega$, $z_0+\omega'$, $z_0+\omega+\omega'$ will be called
a fundamental parallelogram. In general $z_0$ will be taken to be such that there is no pole
on any side. The greater part of the theory of elliptic functions rests on two simple
theorems.

25·05. The integral of an elliptic function about a fundamental parallelogram is zero.
For if $ABCD$ is a fundamental parallelogram the integrals along opposite sides $AB$, $DC$,
are equal and opposite on account of the property of periodicity and the fact that the
sides are traversed in opposite directions; similarly those along $BC$, $DA$ cancel.

An elliptic function with no singularities in a fundamental parallelogram is a constant.
For if it is bounded in a fundamental parallelogram it is bounded over the whole plane on
account of the periodicity; and therefore it is a constant by Liouville’s theorem.

If $f(z)$ is an elliptic function, $f'(z)/f(z)$ is another. Applying the first theorem to $f(z)$, we
see that the sum of the residues at all poles in a fundamental parallelogram is 0. Applying it
to $f'(z)/f(z)$, we see that the number of poles of $f(z)$ is equal to the number of zeros, multiple
poles and zeros being taken multiply. Applying it to $f'(z)/\{f(z) - c\}$, where $c$ is any constant,
the same holds. But the poles of $f(z) - c$ are those of $f(z)$; hence $f(z) - c$ has the same number
of zeros in a fundamental parallelogram as $f(z)$, whatever $c$ may be.
Now take round a fundamental parallelogram

$$\int \frac{z f'(z)}{f(z)}\,dz.$$

If the poles of $f(z)$ are $\alpha_r$, the zeros $\beta_r$, the integral is equal to $2\pi i(\sum\beta_r - \sum\alpha_r)$, multiple poles
and zeros being taken multiply. But

$$\left[\int_{z_0}^{z_0+\omega} + \int_{z_0+\omega+\omega'}^{z_0+\omega'}\right]\frac{z f'(z)}{f(z)}\,dz = -\omega'\int_{z_0}^{z_0+\omega}\frac{f'(z)}{f(z)}\,dz = 2\pi i p'\omega',$$

where $p'$ is an integer, and similarly the integrals along the other two sides give a multiple
of $2\pi i\omega$. Hence the sums of the values of $z$ at the zeros and poles of $f(z)$ in a fundamental
parallelogram differ by an expression of the form $p\omega + p'\omega'$, where $p$ and $p'$ are integers
(possibly zero).
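As an illustration (not taken from the text), these results can be checked on $\operatorname{sn} z$ itself in a period parallelogram of sides $4K$ and $2iK'$. The corner $z_0 = -\epsilon(1+i)$, with $\epsilon$ small and positive, is an assumed choice that keeps zeros and poles off the sides. The zeros inside are then $0$ and $2K$, and the poles are $iK'$ and $2K + iK'$ with residues $1/k$ and $-1/k$.

```latex
\begin{aligned}
\text{sum of residues} &= \frac{1}{k} - \frac{1}{k} = 0,\\
\text{number of zeros} &= \text{number of poles} = 2,\\
\sum\beta_r - \sum\alpha_r &= (0 + 2K) - \{iK' + (2K + iK')\} = -2iK'
  = 0\cdot(4K) + (-1)\cdot(2iK').
\end{aligned}
```

The last line has the form $p\omega + p'\omega'$ with $p = 0$, $p' = -1$.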

25·06. Other Jacobian elliptic functions: cn, dn, etc. The function $\operatorname{sn} z$ has poles
in every fundamental parallelogram. This is surprising in a generalization of $\sin z$, but we
recall that $\sin z$ has an essential singularity at infinity. When $k \to 0$ the imaginary period
of $\operatorname{sn} z$ tends to infinity, and the poles merge into an essential singularity at infinity.

There are four poles in a parallelogram of sides $4K$, $4iK'$; hence the function takes
every other value four times in such a parallelogram.

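Associated with $\operatorname{sn} t$ there are functions corresponding to the other trigonometric
functions. The first two are defined by

$$\operatorname{cn} t = \sqrt{(1 - \operatorname{sn}^2 t)}, \qquad \operatorname{cn} 0 = 1,$$

$$\operatorname{dn} t = \sqrt{(1 - k^2\operatorname{sn}^2 t)}, \qquad \operatorname{dn} 0 = 1.$$

The zeros and poles of $1 - \operatorname{sn}^2 t$, $1 - k^2\operatorname{sn}^2 t$ are double when these functions are considered
as functions of $t$; hence $\operatorname{cn} t$ and $\operatorname{dn} t$ are single-valued when continued analytically. The
other functions are defined by division:

$$\begin{array}{ll}
\operatorname{sc} t = \operatorname{sn} t/\operatorname{cn} t, & \operatorname{cs} t = \operatorname{cn} t/\operatorname{sn} t, \\
\operatorname{sd} t = \operatorname{sn} t/\operatorname{dn} t, & \operatorname{ds} t = \operatorname{dn} t/\operatorname{sn} t, \\
\operatorname{cd} t = \operatorname{cn} t/\operatorname{dn} t, & \operatorname{dc} t = \operatorname{dn} t/\operatorname{cn} t, \\
\operatorname{ns} t = 1/\operatorname{sn} t, & \operatorname{nc} t = 1/\operatorname{cn} t, \\
\operatorname{nd} t = 1/\operatorname{dn} t. &
\end{array}$$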


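Then the fundamental differential equation can be written

$$\frac{d}{dt}\operatorname{sn} t = \operatorname{cn} t\,\operatorname{dn} t. \tag{1}$$

Hence

$$\frac{d}{dt}\operatorname{cn}^2 t = -\frac{d}{dt}\operatorname{sn}^2 t = -2\operatorname{sn} t\,\operatorname{cn} t\,\operatorname{dn} t, \qquad \frac{d}{dt}\operatorname{cn} t = -\operatorname{sn} t\,\operatorname{dn} t, \tag{2}$$

and similarly

$$\frac{d}{dt}\operatorname{dn} t = -k^2\operatorname{sn} t\,\operatorname{cn} t. \tag{3}$$

Since $\operatorname{sn} K = 1$ and $\operatorname{sn}(K + iK') = 1/k$, we have

$$\operatorname{sn} K = 1, \qquad \operatorname{cn} K = 0, \qquad \operatorname{dn} K = k', \tag{4}$$

$$\operatorname{sn}(K + iK') = 1/k, \qquad \operatorname{cn}(K + iK') = -ik'/k, \qquad \operatorname{dn}(K + iK') = 0. \tag{5}$$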
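The three differential equations (1), (2), (3), with the initial values $\operatorname{sn} 0 = 0$, $\operatorname{cn} 0 = \operatorname{dn} 0 = 1$, determine the functions for real $t$ and can be integrated step by step. The following is a minimal sketch using a classical fourth-order Runge-Kutta scheme. The function names are hypothetical, and the evaluation $K = \pi/\{2\,\mathrm{AGM}(1, k')\}$ by the arithmetic-geometric mean is a standard result assumed here, not one derived in the text.

```python
import math

def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def jacobi_rk4(t_end, k, steps=2000):
    """Integrate sn' = cn dn, cn' = -sn dn, dn' = -k^2 sn cn from t = 0 to t_end."""
    def rhs(y):
        s, c, d = y
        return (c * d, -s * d, -k * k * s * c)

    y = (0.0, 1.0, 1.0)          # sn 0 = 0, cn 0 = 1, dn 0 = 1
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
        k3 = rhs(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
        k4 = rhs(tuple(a + h * b for a, b in zip(y, k3)))
        y = tuple(a + h * (p + 2 * q + 2 * r + s) / 6
                  for a, p, q, r, s in zip(y, k1, k2, k3, k4))
    return y                      # (sn, cn, dn) at t_end
```

Integrating to $t = K$ should reproduce the values (4), and the extreme cases $k = 0$ and $k = 1$ should give the circular and hyperbolic functions respectively.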
To reach values of $t$ between $K$ and $K + iK'$, $x$ must pass $+1$ on the upper side; then
$\sqrt{(1 - x^2)}$ becomes $-i\sqrt{(x^2 - 1)}$ and $\operatorname{cn} t$ is negative imaginary in this range. When $k = 0$,
$\operatorname{cn} t$ reduces to $\cos t$, $\operatorname{dn} t$ to 1, $\operatorname{sc} t$ to $\tan t$, and so on. The functions are also reducible to
elementary functions in the other extreme case $k = 1$, when

$$\operatorname{sn} t = \tanh t, \qquad \operatorname{cn} t = \operatorname{dn} t = \operatorname{sech} t.$$

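25·07. Differentiation. By direct transformation we find the following differential
equations satisfied by the functions:

$$\begin{array}{ll}
y = \operatorname{sn} t, & \dfrac{dy}{dt} = \sqrt{\{(1-y^2)(1-k^2y^2)\}}, \\[2ex]
y = \operatorname{cn} t, & \dfrac{dy}{dt} = -\sqrt{\{(1-y^2)(k'^2+k^2y^2)\}}, \\[2ex]
y = \operatorname{nc} t, & \dfrac{dy}{dt} = \sqrt{\{(y^2-1)(k'^2y^2+k^2)\}}, \\[2ex]
y = \operatorname{dn} t, & \dfrac{dy}{dt} = -\sqrt{\{(1-y^2)(y^2-k'^2)\}}, \\[2ex]
y = \operatorname{nd} t, & \dfrac{dy}{dt} = \sqrt{\{(y^2-1)(1-k'^2y^2)\}}, \\[2ex]
y = \operatorname{sc} t, & \dfrac{dy}{dt} = \sqrt{\{(1+y^2)(1+k'^2y^2)\}}, \\[2ex]
y = \operatorname{ns} t, & \dfrac{dy}{dt} = -\sqrt{\{(y^2-1)(y^2-k^2)\}}, \\[2ex]
y = \operatorname{cs} t, & \dfrac{dy}{dt} = -\sqrt{\{(y^2+1)(y^2+k'^2)\}}, \\[2ex]
y = \operatorname{cd} t, & \dfrac{dy}{dt} = -\sqrt{\{(1-y^2)(1-k^2y^2)\}}, \\[2ex]
y = \operatorname{sd} t, & \dfrac{dy}{dt} = \sqrt{\{(1-k'^2y^2)(1+k^2y^2)\}}, \\[2ex]
y = \operatorname{dc} t, & \dfrac{dy}{dt} = \sqrt{\{(y^2-1)(y^2-k^2)\}}, \\[2ex]
y = \operatorname{ds} t, & \dfrac{dy}{dt} = -\sqrt{\{(y^2-k'^2)(y^2+k^2)\}}.
\end{array}$$

The roots are to be taken positive for $0 \le t \le K$.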
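As a worked instance of the "direct transformation", the entry for $y = \operatorname{sc} t$ follows from the derivatives of $\operatorname{sn} t$ and $\operatorname{cn} t$ together with $\operatorname{cn}^2 t + \operatorname{sn}^2 t = 1$ and $\operatorname{dn}^2 t + k^2\operatorname{sn}^2 t = 1$. The other entries can be checked in the same way.

```latex
\begin{aligned}
\frac{dy}{dt} &= \frac{\operatorname{cn}^2 t\,\operatorname{dn} t + \operatorname{sn}^2 t\,\operatorname{dn} t}{\operatorname{cn}^2 t}
              = \frac{\operatorname{dn} t}{\operatorname{cn}^2 t},\\
1 + y^2 &= \frac{1}{\operatorname{cn}^2 t},\qquad
1 + k'^2 y^2 = \frac{\operatorname{cn}^2 t + (1-k^2)\operatorname{sn}^2 t}{\operatorname{cn}^2 t}
             = \frac{\operatorname{dn}^2 t}{\operatorname{cn}^2 t},\\
\sqrt{\{(1+y^2)(1+k'^2y^2)\}} &= \frac{\operatorname{dn} t}{\operatorname{cn}^2 t} = \frac{dy}{dt}
\qquad (0 \le t < K).
\end{aligned}
```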

We notice that the functions ns t, cs t, ds t have poles of residue 1 at the origin and that 
the signs of the constants in the roots are respectively both negative, both positive, and 
different. E. H. Neville finds it convenient to take these as the fundamental functions. 

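25·08. Residues at poles. We have seen that all poles of $\operatorname{sn} t$ are simple poles of
residue $\pm 1/k$; hence all those of $\operatorname{cn} t$ are simple with residue $\pm i/k$, and of $\operatorname{dn} t$ with residue
$\pm i$. When $x = \operatorname{sn} t \to +\infty$, passing $+1$ and $+1/k$ on the positive side, $t \to iK'$, $\operatorname{cn} t$ behaves
like $-i\operatorname{sn} t$, $\operatorname{dn} t$ like $-ik\operatorname{sn} t$. But the residue of $\operatorname{sn} t$ at $iK'$ is $1/k$; therefore those of $\operatorname{cn} t$
and $\operatorname{dn} t$ are $-i/k$ and $-i$. Since $\operatorname{cn} t$ and $\operatorname{dn} t$ are even functions their residues at $-iK'$
are $i/k$ and $i$; for

$$\frac{A}{t-\beta} - \frac{A}{t+\beta} = \frac{2A\beta}{t^2-\beta^2},$$

which is an even function of $t$.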

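Apart from a sign, $\operatorname{cd} t$ and $\operatorname{sn} t$ satisfy the same differential equation. But if $K < t < 2K$,
$\operatorname{sn} t$ is a decreasing function. Since the equation is of the first order and does not contain
$t$ explicitly, it follows that

$$\operatorname{sn}(t + \gamma) = \operatorname{cd} t,$$

where $\gamma$ is some constant. With $\gamma = K$ this holds for $t = 0$, and therefore universally.
Hence

$$\operatorname{sn}(t + K) = \operatorname{cn} t/\operatorname{dn} t. \tag{1}$$

By transformation, attending to the signs of the functions between $K$ and $2K$, we get

$$\operatorname{cn}(t + K) = -k'\operatorname{sn} t/\operatorname{dn} t, \qquad \operatorname{dn}(t + K) = k'/\operatorname{dn} t. \tag{2}$$

It follows that

$$\operatorname{sn}(t + 2K) = -\operatorname{sn} t, \qquad \operatorname{cn}(t + 2K) = -\operatorname{cn} t, \qquad \operatorname{dn}(t + 2K) = \operatorname{dn} t. \tag{3}$$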
Hence $\operatorname{sn} t$ and $\operatorname{cn} t$ have period $4K$; but $\operatorname{dn} t$ has period $2K$. This permits us to say that the
residues of $\operatorname{cn} t$ at $2K + iK'$ and $2K - iK'$ are $+i/k$ and $-i/k$; those of $\operatorname{dn} t$ are $-i$ and $+i$.
We therefore have the following set of residues.

$$\begin{array}{c|cccc}
 & iK' & -iK' & 2K + iK' & 2K - iK' \\ \hline
\operatorname{sn} t & 1/k & 1/k & -1/k & -1/k \\
\operatorname{cn} t & -i/k & +i/k & +i/k & -i/k \\
\operatorname{dn} t & -i & +i & -i & +i
\end{array}$$


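If we put $\operatorname{ns} t = kz$, $z$ also satisfies the differential equation for $\operatorname{sn} t$ except for a sign.
Hence there is a $\gamma$ such that

$$\operatorname{sn}(t + \gamma) = \pm\frac{1}{k\operatorname{sn} t}.$$

To make the poles correspond we take $\gamma = iK'$, and taking $t$ small we fix the sign as positive
since the residue of $\operatorname{sn} t$ is $+1/k$ at $t = iK'$. Hence

$$\operatorname{sn}(t + iK') = \frac{1}{k\operatorname{sn} t}, \qquad \operatorname{sn}(t + 2iK') = \operatorname{sn} t. \tag{4}$$

Further, fixing the constants suitably, we get

$$\operatorname{cn}(t + iK') = -\frac{i}{k}\frac{\operatorname{dn} t}{\operatorname{sn} t}, \tag{5}$$

$$\operatorname{dn}(t + iK') = -i\frac{\operatorname{cn} t}{\operatorname{sn} t}, \tag{6}$$

and therefore

$$\operatorname{cn}(t + 2iK') = -\operatorname{cn} t, \qquad \operatorname{dn}(t + 2iK') = -\operatorname{dn} t. \tag{7}$$

Also

$$\operatorname{sn}(t + K + iK') = \frac{\operatorname{dn} t}{k\operatorname{cn} t}, \qquad \operatorname{cn}(t + K + iK') = -\frac{ik'}{k}\frac{1}{\operatorname{cn} t}, \qquad \operatorname{dn}(t + K + iK') = ik'\frac{\operatorname{sn} t}{\operatorname{cn} t}. \tag{8}$$

$$\operatorname{sn}(t + 2K + 2iK') = -\operatorname{sn} t, \qquad \operatorname{cn}(t + 2K + 2iK') = \operatorname{cn} t, \qquad \operatorname{dn}(t + 2K + 2iK') = -\operatorname{dn} t. \tag{9}$$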

Each of the functions is therefore periodic in a parallelogram whose area is half that of the 
parallelogram of sides 4 K and 4 iK'. All are periodic in the latter parallelogram. It is 
usually convenient to use the parallelogram of sides 4 K and 4 iK' for this reason. 
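The quarter-period relations of this section can be checked numerically for a complex argument. The sketch below evaluates $\operatorname{sn}$, $\operatorname{cn}$, $\operatorname{dn}$ from the theta-function quotients given later in 25·111 (22), (23), with $K$ and $K'$ obtained from the arithmetic-geometric mean. That AGM evaluation is a standard result assumed here, not one from the text, and the function names are hypothetical.

```python
import cmath
import math

def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def quarter_periods(k):
    """Return (K, K') for real 0 < k < 1 (AGM evaluation, an assumption of this sketch)."""
    kp = math.sqrt(1.0 - k * k)
    return math.pi / (2.0 * agm(1.0, kp)), math.pi / (2.0 * agm(1.0, k))

def jacobi(t, k, terms=12):
    """Return (sn t, cn t, dn t) for complex t, using quotients of theta series."""
    kp = math.sqrt(1.0 - k * k)
    K, Kp = quarter_periods(k)
    q = math.exp(-math.pi * Kp / K)          # nome q = exp(-pi K'/K)
    z = math.pi * t / (2.0 * K)
    th0 = 1 + 2 * sum((-1) ** n * q ** (n * n) * cmath.cos(2 * n * z)
                      for n in range(1, terms))
    th3 = 1 + 2 * sum(q ** (n * n) * cmath.cos(2 * n * z)
                      for n in range(1, terms))
    th1 = 2 * sum((-1) ** n * q ** ((n + 0.5) ** 2) * cmath.sin((2 * n + 1) * z)
                  for n in range(terms))
    th2 = 2 * sum(q ** ((n + 0.5) ** 2) * cmath.cos((2 * n + 1) * z)
                  for n in range(terms))
    return (th1 / (math.sqrt(k) * th0),
            math.sqrt(kp / k) * th2 / th0,
            math.sqrt(kp) * th3 / th0)
```

Twelve terms are ample for moderate $k$; for very small $k$ the ratio $K'/K$ is large and the cosines of large imaginary argument may overflow, so the sketch is intended only for moduli that are not extreme.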

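25·09. Partial fraction and trigonometric expansions. One of the most prolific
sources of formulae in elliptic functions is the comparison of the principal parts at poles.
The method is illustrated most simply by the functions $\operatorname{ns} t$, $\operatorname{cs} t$, $\operatorname{ds} t$. These are all odd
functions with residue 1 at $t = 0$. The effects of adding $2K$, $2iK'$, $2K + 2iK'$ to $t$ are shown
in the following table of signs.

$$\begin{array}{c|ccc}
 & 2K & 2iK' & 2K + 2iK' \\ \hline
\operatorname{sn} & - & + & - \\
\operatorname{cn} & - & - & + \\
\operatorname{dn} & + & - & - \\
\operatorname{ns} & - & + & - \\
\operatorname{cs} & + & - & - \\
\operatorname{ds} & - & - & +
\end{array}$$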
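Hence by Mittag-Leffler’s theorem (the functions being bounded on a suitably chosen
set of parallelograms tending to infinity)

$$\operatorname{ns} t = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\left\{\frac{1}{t-4mK-4niK'} - \frac{1}{t-(4m+2)K-4niK'} + \frac{1}{t-4mK-(4n+2)iK'} - \frac{1}{t-(4m+2)K-(4n+2)iK'}\right\},$$

$$\operatorname{cs} t = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\left\{\frac{1}{t-4mK-4niK'} + \frac{1}{t-(4m+2)K-4niK'} - \frac{1}{t-4mK-(4n+2)iK'} - \frac{1}{t-(4m+2)K-(4n+2)iK'}\right\},$$

$$\operatorname{ds} t = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\left\{\frac{1}{t-4mK-4niK'} - \frac{1}{t-(4m+2)K-4niK'} - \frac{1}{t-4mK-(4n+2)iK'} + \frac{1}{t-(4m+2)K-(4n+2)iK'}\right\}.$$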

The integrals 25·03 (2), 25·031 (9) or (14) would define $K$ and $K'$ if $k$ was complex
($K$ having singularities at $k = \pm 1$ and $K'$ at $k = 0$), and the series just given will still converge
and be analytic functions both of $t$ and of $k$. Hence they lead to definitions of the
elliptic functions, satisfying the same differential equations, even if $k$ is complex, and
we can now remove the restriction that $k$ is real.


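Now

$$\frac{\pi}{2K}\cot\frac{\pi v}{2K} = \frac{1}{v} + {\sum}'\left(\frac{1}{v-2nK} + \frac{1}{2nK}\right),$$

$$\frac{\pi}{2K}\operatorname{cosec}\frac{\pi v}{2K} = \frac{1}{v} + {\sum}'(-1)^n\left(\frac{1}{v-2nK} + \frac{1}{2nK}\right),$$

and therefore

$$\operatorname{ns} t = \frac{\pi}{2K}\left[\operatorname{cosec}\frac{\pi t}{2K} + \sum_{n=1}^{\infty}\left\{\operatorname{cosec}\frac{\pi}{2K}(t - 2niK') + \operatorname{cosec}\frac{\pi}{2K}(t + 2niK')\right\}\right]
= \frac{\pi}{2K}\left[\operatorname{cosec}\frac{\pi t}{2K} + \sum_{n=1}^{\infty}\frac{4\sin\dfrac{\pi t}{2K}\cosh\dfrac{n\pi K'}{K}}{\cosh\dfrac{2n\pi K'}{K} - \cos\dfrac{\pi t}{K}}\right],$$

$$\operatorname{cs} t = \frac{\pi}{2K}\left[\cot\frac{\pi t}{2K} + \sum_{n=1}^{\infty}\frac{2\sin\dfrac{\pi t}{K}}{\cosh\dfrac{4n\pi K'}{K} - \cos\dfrac{\pi t}{K}} - \sum_{n=0}^{\infty}\frac{2\sin\dfrac{\pi t}{K}}{\cosh\dfrac{(4n+2)\pi K'}{K} - \cos\dfrac{\pi t}{K}}\right],$$

$$\operatorname{ds} t = \frac{\pi}{2K}\left[\operatorname{cosec}\frac{\pi t}{2K} + \sum_{n=1}^{\infty}(-1)^n\frac{4\sin\dfrac{\pi t}{2K}\cosh\dfrac{n\pi K'}{K}}{\cosh\dfrac{2n\pi K'}{K} - \cos\dfrac{\pi t}{K}}\right].$$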

The series are rapidly convergent if $K'/K$ is not very small. A physical illustration is
given by a lattice of charged wires arranged regularly in planes, in such a way that every
set of four taken at the corners of a cell contains two positively and two negatively charged
wires. In such conditions the potential at a point is determined mainly by the charges in
the neighbouring planes, the contributions from more distant charges nearly cancelling.
25·10. The addition formulae. If $v$ is constant, $\operatorname{sn}(u + v)$ is an elliptic function
of $u$, with the same periods as $\operatorname{sn} u$. For $k = 0$ and $k = 1$ it can be expressed in terms of
functions of $u$ and $v$, as follows:

$$\sin(u + v) = \sin u\cos v + \cos u\sin v,$$

$$\tanh(u + v) = \frac{\tanh u + \tanh v}{1 + \tanh u\tanh v}.$$

There is no obvious similarity between these formulae to suggest an analogous expression
for general $k$. But for small $v$ and any $k$

$$\operatorname{sn}(u + v) = \operatorname{sn} u + \operatorname{cn} u\operatorname{dn} u\operatorname{sn} v + O(v^2)
= \operatorname{sn} u\operatorname{cn} v\operatorname{dn} v + \operatorname{cn} u\operatorname{dn} u\operatorname{sn} v + O(u^2v^2),$$

by symmetry. For $k = 0$ the error term vanishes. For $k = 1$ the right side is

$$\tanh u\operatorname{sech}^2 v + \tanh v\operatorname{sech}^2 u + O(u^2v^2)
= (\tanh u + \tanh v)(1 - \tanh u\tanh v) + O(u^2v^2)
= (1 - \tanh^2 u\tanh^2 v)\tanh(u + v) + O(u^2v^2).$$

Thus both extreme cases are included in the formula

$$\operatorname{sn}(u + v) = \frac{\operatorname{sn} u\operatorname{cn} v\operatorname{dn} v + \operatorname{cn} u\operatorname{dn} u\operatorname{sn} v}{1 - f(k)\operatorname{sn}^2 u\operatorname{sn}^2 v},$$

where $f(0) = 0$, $f(1) = 1$; and the formula is right to $O(u^2v^2)$ for any $k$.

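Now $\operatorname{sn}(u + v)$, considered as a function of $u$, has a pole at $iK' - v$. But the numerator
is finite unless $u$ or $v$ itself is of the form $iK' + 2mK + 2niK'$; hence the pole can arise only
from the vanishing of the denominator. But

$$\operatorname{sn}(iK' - v) = -\frac{1}{k\operatorname{sn} v},$$

and therefore this condition is satisfied for all $v$ if, and only if,

$$f(k) = k^2,$$

and then

$$\operatorname{sn}(u + v) = \frac{\operatorname{sn} u\operatorname{cn} v\operatorname{dn} v + \operatorname{cn} u\operatorname{dn} u\operatorname{sn} v}{1 - k^2\operatorname{sn}^2 u\operatorname{sn}^2 v}. \tag{1}$$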


If there is any formula expressing sn (u + v) in terms of elliptic functions of u and v, it 
must therefore be (1). It remains to show that (1) is true. 

The easiest method is by comparison of residues. The left side has poles of residue $1/k$
at $u = -v \pm iK'$, and of residue $-1/k$ at $u = -v + 2K \pm iK'$. Adding $2K$ to $u$ reverses both
sides of (1), and both sides have period $2iK'$. We need therefore consider only $u = -v + iK'$.

At any pole of $\operatorname{sn} u$ the right side is analytic. Hence poles of the right side can arise
only from the vanishing of the denominator, and will be simple. The denominator vanishes
if $u = -v \pm iK'$, but also if $u = v \pm iK'$. We must therefore also consider $u = v + iK'$. But

$$\operatorname{sn}(v + iK')\operatorname{cn} v\operatorname{dn} v + \operatorname{sn} v\operatorname{cn}(v + iK')\operatorname{dn}(v + iK')
= \frac{1}{k\operatorname{sn} v}\operatorname{cn} v\operatorname{dn} v + \operatorname{sn} v\left(-\frac{i\operatorname{dn} v}{k\operatorname{sn} v}\right)\left(-\frac{i\operatorname{cn} v}{\operatorname{sn} v}\right) = 0,$$
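and therefore $v + iK'$ is not a pole of the right side. Hence the left and right sides have the
same poles. Again, if $u + v - iK'$ is small, say $\theta$,

$$\operatorname{sn}^2(iK' - v + \theta) = \frac{1}{k^2\operatorname{sn}^2(v - \theta)} = \frac{1}{k^2\operatorname{sn}^2 v} + \frac{2\operatorname{cn} v\operatorname{dn} v}{k^2\operatorname{sn}^3 v}\theta + O(\theta^2),$$

and the right of (1) is

$$-\frac{2\operatorname{cn} v\operatorname{dn} v}{k\operatorname{sn} v}\bigg/\left(-\frac{2\operatorname{cn} v\operatorname{dn} v}{\operatorname{sn} v}\theta\right) + O(1) = \frac{1}{k\theta} + O(1).$$

Hence both functions have the same poles and the same residues, and can therefore differ
only by a constant, which we identify as 0 by taking $u = 0$.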

Alternatively, knowing that both sides of (1) have the same poles and zeros we infer 
that their ratio is a constant; and we show that this constant is 1 by taking u small . 

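The corresponding formulae for $\operatorname{cn}(u + v)$ and $\operatorname{dn}(u + v)$ are

$$\operatorname{cn}(u + v) = \frac{\operatorname{cn} u\operatorname{cn} v - \operatorname{sn} u\operatorname{sn} v\operatorname{dn} u\operatorname{dn} v}{1 - k^2\operatorname{sn}^2 u\operatorname{sn}^2 v}, \tag{2}$$

$$\operatorname{dn}(u + v) = \frac{\operatorname{dn} u\operatorname{dn} v - k^2\operatorname{sn} u\operatorname{sn} v\operatorname{cn} u\operatorname{cn} v}{1 - k^2\operatorname{sn}^2 u\operatorname{sn}^2 v}. \tag{3}$$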


These are easily verified by direct transformation. 
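A quick consistency check of (1), (2), (3) is to put $v = K$, where $\operatorname{sn} K = 1$, $\operatorname{cn} K = 0$, $\operatorname{dn} K = k'$. The addition formulae should then reduce to the quarter-period relations 25·08 (1), (2). The denominator becomes $1 - k^2\operatorname{sn}^2 u = \operatorname{dn}^2 u$, and:

```latex
\begin{aligned}
\operatorname{sn}(u+K) &= \frac{\operatorname{sn} u\cdot 0\cdot k' + \operatorname{cn} u\,\operatorname{dn} u\cdot 1}{\operatorname{dn}^2 u}
   = \frac{\operatorname{cn} u}{\operatorname{dn} u},\\
\operatorname{cn}(u+K) &= \frac{\operatorname{cn} u\cdot 0 - \operatorname{sn} u\cdot 1\cdot\operatorname{dn} u\cdot k'}{\operatorname{dn}^2 u}
   = -\frac{k'\operatorname{sn} u}{\operatorname{dn} u},\\
\operatorname{dn}(u+K) &= \frac{\operatorname{dn} u\cdot k' - k^2\operatorname{sn} u\cdot 1\cdot\operatorname{cn} u\cdot 0}{\operatorname{dn}^2 u}
   = \frac{k'}{\operatorname{dn} u}.
\end{aligned}
```

These agree with 25·08 (1) and (2).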

We have thought it interesting to show how these formulae might have been discovered 
by study of extreme cases. They were actually found in a totally different way. Euler 
found a complicated identity connecting elliptic integrals, which became (1) when translated
into Jacobi’s notation.

Another method of verification, quite straightforward but best suited for a long spell 
in a railway waiting room, is to differentiate the right sides and show that the derivatives 
of each with regard to u and v are equal. They are therefore functions of u + v, and are 
identified by taking u = 0. 

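25·11. Infinite products for sn t, cn t, dn t. The function $d(\log\operatorname{sn} t)/dt$ has simple
poles of residue $+1$ at all zeros of $\operatorname{sn} t$, and of residue $-1$ at all poles of $\operatorname{sn} t$. Mittag-Leffler’s
theorem therefore shows that

$$\frac{d}{dt}\log\operatorname{sn} t = \frac{1}{t} + \sum\sum\left(\frac{1}{t-2niK'-2mK} + \frac{1}{2niK'+2mK} - \frac{1}{t-(2n+1)iK'-2mK} - \frac{1}{(2n+1)iK'+2mK}\right),$$

$m = 0$, $n = 0$ being excluded from the first two terms. If we take equal numbers of
positive and negative values of $m$ and $n$ the constant terms will cancel. Hence

$$\frac{d}{dt}\log\operatorname{sn} t = \frac{\pi}{2K}\left\{\cot\frac{\pi t}{2K} + {\sum_n}'\cot\frac{\pi(t-2niK')}{2K} - \sum_n\cot\frac{\pi\{t-(2n+1)iK'\}}{2K}\right\},$$

$$\operatorname{sn} t = A\sin\frac{\pi t}{2K}\,\frac{{\prod}'\sin\dfrac{\pi(t-2niK')}{2K}}{\prod\sin\dfrac{\pi\{t-(2n+1)iK'\}}{2K}}
= A\sin\frac{\pi t}{2K}\prod_{n=1}^{\infty}\frac{\cosh\dfrac{2n\pi K'}{K} - \cos\dfrac{\pi t}{K}}{\cosh\dfrac{(2n-1)\pi K'}{K} - \cos\dfrac{\pi t}{K}}.$$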
Put $iK' = K\tau$, $q = e^{i\pi\tau} = e^{-\pi K'/K}$. We assume $\Re(K'/K) > 0$. Then $|q| < 1$, and

$$\operatorname{sn} t = B\sin\frac{\pi t}{2K}\prod_{n=1}^{\infty}\frac{1 - 2q^{2n}\cos\dfrac{\pi t}{K} + q^{4n}}{1 - 2q^{2n-1}\cos\dfrac{\pi t}{K} + q^{4n-2}}
= \frac{2K}{\pi}\sin\frac{\pi t}{2K}\prod_{n=1}^{\infty}\frac{(1-q^{2n-1})^2}{(1-q^{2n})^2}\,\frac{1 - 2q^{2n}\cos\dfrac{\pi t}{K} + q^{4n}}{1 - 2q^{2n-1}\cos\dfrac{\pi t}{K} + q^{4n-2}},$$

the constant being adjusted to give the correct derivative at $t = 0$. The products are all
absolutely convergent since $|q| < 1$. Similarly

$$\frac{d}{dt}\log\operatorname{cn} t = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\left(\frac{1}{t-(2m+1)K-2niK'} - \frac{1}{t-(2n+1)iK'-2mK}\right)
= -\frac{\pi}{2K}\left\{\sum_n\tan\frac{\pi(t-2niK')}{2K} + \sum_n\cot\frac{\pi\{t-(2n+1)iK'\}}{2K}\right\},$$

whence

$$\operatorname{cn} t = \cos\frac{\pi t}{2K}\prod_{n=1}^{\infty}\frac{(1-q^{2n-1})^2\left(1 + 2q^{2n}\cos\dfrac{\pi t}{K} + q^{4n}\right)}{(1+q^{2n})^2\left(1 - 2q^{2n-1}\cos\dfrac{\pi t}{K} + q^{4n-2}\right)}.$$

Also

$$\operatorname{dn} t = \prod_{n=1}^{\infty}\frac{(1-q^{2n-1})^2\left(1 + 2q^{2n-1}\cos\dfrac{\pi t}{K} + q^{4n-2}\right)}{(1+q^{2n-1})^2\left(1 - 2q^{2n-1}\cos\dfrac{\pi t}{K} + q^{4n-2}\right)}.$$
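The three infinite products in $q$ can be multiplied out numerically and tested against the defining relations $\operatorname{sn}^2 t + \operatorname{cn}^2 t = 1$ and $\operatorname{dn}^2 t + k^2\operatorname{sn}^2 t = 1$. The sketch below is illustrative only. It takes $K$ and $K'$ from the arithmetic-geometric mean (a standard evaluation assumed here, not derived in the text), and the function name is hypothetical.

```python
import math

def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def jacobi_products(t, k, terms=60):
    """Return (sn t, cn t, dn t) for real t from the infinite products in q."""
    kp = math.sqrt(1.0 - k * k)
    K = math.pi / (2.0 * agm(1.0, kp))
    Kp = math.pi / (2.0 * agm(1.0, k))
    q = math.exp(-math.pi * Kp / K)
    c = math.cos(math.pi * t / K)
    sn = (2.0 * K / math.pi) * math.sin(math.pi * t / (2.0 * K))
    cn = math.cos(math.pi * t / (2.0 * K))
    dn = 1.0
    for n in range(1, terms + 1):
        qe = q ** (2 * n)          # q^(2n);   qe*qe = q^(4n)
        qo = q ** (2 * n - 1)      # q^(2n-1); qo*qo = q^(4n-2)
        den = 1 - 2 * qo * c + qo * qo
        sn *= (1 - qo) ** 2 / (1 - qe) ** 2 * (1 - 2 * qe * c + qe * qe) / den
        cn *= (1 - qo) ** 2 / (1 + qe) ** 2 * (1 + 2 * qe * c + qe * qe) / den
        dn *= (1 - qo) ** 2 / (1 + qo) ** 2 * (1 + 2 * qo * c + qo * qo) / den
    return sn, cn, dn
```

Because $|q|$ is small for moderate $k$, far fewer than sixty factors are needed in practice; the generous default merely avoids tuning.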


25·111. $\vartheta$ functions. These formulae express the elliptic functions as ratios of four
integral functions all possessing the period $4K$. These functions can also be expressed as
series. It is convenient to put

$$z = \frac{\pi t}{2K}, \tag{1}$$

so that the period becomes $2\pi$. Then any of the four integral functions, for $z$ with constant
imaginary part, can be expressed as a Fourier series, which can be extended to other values
of the imaginary part by continuation (if it remains convergent). Take first, with $q = e^{i\pi\tau}$,

$$\phi(z) = \prod_{n=1}^{\infty}(1 - 2q^{2n-1}\cos 2z + q^{4n-2}) = \prod_{n=1}^{\infty}(1 - q^{2n-1}e^{2iz})(1 - q^{2n-1}e^{-2iz}).$$

Then

$$\phi(z + \pi\tau) = \prod_{n=1}^{\infty}(1 - q^{2n+1}e^{2iz})(1 - q^{2n-3}e^{-2iz}) = \frac{1 - q^{-1}e^{-2iz}}{1 - qe^{2iz}}\phi(z) = -q^{-1}e^{-2iz}\phi(z). \tag{2}$$

Also if

$$\phi(z) = A_0 + 2A_2\cos 2z + 2A_4\cos 4z + \dots = A_0 + A_2(e^{2iz} + e^{-2iz}) + A_4(e^{4iz} + e^{-4iz}) + \dots,$$

$$\phi(z + \pi\tau) = A_0 + A_2(q^2e^{2iz} + q^{-2}e^{-2iz}) + A_4(q^4e^{4iz} + q^{-4}e^{-4iz}) + \dots$$

and also

$$\phi(z + \pi\tau) = -q^{-1}e^{-2iz}\{A_0 + A_2(e^{2iz} + e^{-2iz}) + \dots\}.$$
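On equating coefficients we find

$$\phi(z) = A_0\left\{1 + 2\sum_{n=1}^{\infty}(-1)^n q^{n^2}\cos 2nz\right\}, \tag{3}$$

which we define as

$$A_0\vartheta_0(z). \tag{4}$$

Then by substituting in turn the values $z + \tfrac12\pi$, $z + \tfrac12\pi\tau$, $z + \tfrac12\pi + \tfrac12\pi\tau$ for $z$ we find

$$\phi(z + \tfrac12\pi) = \prod_{n=1}^{\infty}(1 + 2q^{2n-1}\cos 2z + q^{4n-2}), \tag{5}$$

$$\phi(z + \tfrac12\pi\tau) = 2ie^{-iz}\sin z\prod_{n=1}^{\infty}(1 - 2q^{2n}\cos 2z + q^{4n}), \tag{6}$$

$$\phi(z + \tfrac12\pi + \tfrac12\pi\tau) = 2e^{-iz}\cos z\prod_{n=1}^{\infty}(1 + 2q^{2n}\cos 2z + q^{4n}). \tag{7}$$

Three functions $\vartheta_1(z)$, $\vartheta_2(z)$, $\vartheta_3(z)$ are defined in terms of $\vartheta_0(z)$ and expressed in series
as follows:

$$\vartheta_0(z + \tfrac12\pi) = \vartheta_3(z), \qquad \vartheta_3(z) = 1 + 2\sum_{n=1}^{\infty}q^{n^2}\cos 2nz, \tag{8}$$

$$\vartheta_0(z + \tfrac12\pi\tau) = iq^{-1/4}e^{-iz}\vartheta_1(z), \qquad \vartheta_1(z) = 2\sum_{n=0}^{\infty}(-1)^n q^{(n+1/2)^2}\sin(2n+1)z, \tag{9}$$

$$\vartheta_0(z + \tfrac12\pi + \tfrac12\pi\tau) = q^{-1/4}e^{-iz}\vartheta_2(z), \qquad \vartheta_2(z) = 2\sum_{n=0}^{\infty}q^{(n+1/2)^2}\cos(2n+1)z. \tag{10}$$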


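The four functions $\vartheta_0$, $\vartheta_1$, $\vartheta_2$, $\vartheta_3$ are Jacobi’s theta-functions.* They are seen to be directly
related to the four infinite products that we have found in the expression of sn, cn, dn.
In fact if we put $1/A_0 = G$,

$$\vartheta_0 = G\prod_{n=1}^{\infty}(1 - 2q^{2n-1}\cos 2z + q^{4n-2}), \tag{11}$$

$$\vartheta_1 = 2Gq^{1/4}\sin z\prod_{n=1}^{\infty}(1 - 2q^{2n}\cos 2z + q^{4n}), \tag{12}$$

$$\vartheta_2 = 2Gq^{1/4}\cos z\prod_{n=1}^{\infty}(1 + 2q^{2n}\cos 2z + q^{4n}), \tag{13}$$

$$\vartheta_3 = G\prod_{n=1}^{\infty}(1 + 2q^{2n-1}\cos 2z + q^{4n-2}), \tag{14}$$

$$\operatorname{sn}\frac{2Kz}{\pi} = \frac{2K}{\pi}\frac{1}{2q^{1/4}}\frac{\vartheta_1(z)}{\vartheta_0(z)}\prod_{n=1}^{\infty}\frac{(1-q^{2n-1})^2}{(1-q^{2n})^2}, \tag{15}$$

$$\operatorname{cn}\frac{2Kz}{\pi} = \frac{1}{2q^{1/4}}\frac{\vartheta_2(z)}{\vartheta_0(z)}\prod_{n=1}^{\infty}\frac{(1-q^{2n-1})^2}{(1+q^{2n})^2} = \frac{\vartheta_0(0)\vartheta_2(z)}{\vartheta_2(0)\vartheta_0(z)}, \tag{16}$$

$$\operatorname{dn}\frac{2Kz}{\pi} = \frac{\vartheta_3(z)}{\vartheta_0(z)}\prod_{n=1}^{\infty}\frac{(1-q^{2n-1})^2}{(1+q^{2n-1})^2} = \frac{\vartheta_0(0)\vartheta_3(z)}{\vartheta_3(0)\vartheta_0(z)}. \tag{17}$$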


It is readily verified that all the $\vartheta$ functions satisfy the equation of heat conduction in
the form

$$\frac{\partial^2\vartheta}{\partial z^2} = \frac{4i}{\pi}\frac{\partial\vartheta}{\partial\tau}. \tag{18}$$
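The verification of (18) can be written out term by term. Since $q = e^{i\pi\tau}$, every term of the series (3), (8) has the form $q^{n^2}e^{\pm 2inz}$ and every term of (9), (10) the form $q^{(n+1/2)^2}e^{\pm i(2n+1)z}$, apart from constant factors. For either type the two sides of (18) agree:

```latex
\begin{aligned}
\frac{\partial^2}{\partial z^2}\,e^{i\pi\tau n^2 \pm 2inz} &= -4n^2\,e^{i\pi\tau n^2 \pm 2inz},
&\frac{4i}{\pi}\frac{\partial}{\partial\tau}\,e^{i\pi\tau n^2 \pm 2inz} &= \frac{4i}{\pi}(i\pi n^2)\,e^{i\pi\tau n^2 \pm 2inz} = -4n^2\,e^{i\pi\tau n^2 \pm 2inz},\\
\frac{\partial^2}{\partial z^2}\,e^{i\pi\tau(n+\frac12)^2 \pm i(2n+1)z} &= -(2n+1)^2\,e^{(\dots)},
&\frac{4i}{\pi}\frac{\partial}{\partial\tau}\,e^{i\pi\tau(n+\frac12)^2 \pm i(2n+1)z} &= -4(n+\tfrac12)^2\,e^{(\dots)} = -(2n+1)^2\,e^{(\dots)}.
\end{aligned}
```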


* Fundamenta Nova, 1829. Whittaker and Watson denote by $\vartheta_4$ the function here called $\vartheta_0$.
Their $\vartheta_r(\pi z)$ is our $\vartheta_r(z)$.
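Since $\operatorname{cn} 0 = \operatorname{dn} 0 = 1$,

$$\frac{\vartheta_2(0)}{2q^{1/4}\prod(1+q^{2n})^2} = \frac{\vartheta_3(0)}{\prod(1+q^{2n-1})^2} = \frac{\vartheta_0(0)}{\prod(1-q^{2n-1})^2}. \tag{19}$$

When $z = \tfrac12\pi$, $\operatorname{dn} 2Kz/\pi = k'$; hence

$$\sqrt{k'} = \frac{\vartheta_0(0)}{\vartheta_3(0)}. \tag{20}$$

Also when $z = \tfrac12\pi + \tfrac12\pi\tau$, $\operatorname{cn} 2Kz/\pi = -ik'/k$; then

$$\frac{k'}{k} = \frac{\vartheta_0^2(0)}{\vartheta_2^2(0)},$$

whence

$$\sqrt{k} = \frac{\vartheta_2(0)}{\vartheta_3(0)}, \tag{21}$$

and

$$\operatorname{sn}\frac{2Kz}{\pi} = \frac{1}{\sqrt k}\frac{\vartheta_1(z)}{\vartheta_0(z)}, \tag{22}$$

by choosing the constant factor so that $\operatorname{sn} 2Kz/\pi = 1/k$ when $z = \tfrac12\pi + \tfrac12\pi\tau$. Hence also

$$\operatorname{cn}\frac{2Kz}{\pi} = \left(\frac{k'}{k}\right)^{1/2}\frac{\vartheta_2(z)}{\vartheta_0(z)}, \qquad \operatorname{dn}\frac{2Kz}{\pi} = \sqrt{k'}\,\frac{\vartheta_3(z)}{\vartheta_0(z)}. \tag{23}$$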

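Now we can write (22) as

$$\vartheta_1(z) = \sqrt{k}\,\vartheta_0(z)\operatorname{sn}\frac{2Kz}{\pi}, \tag{24}$$

whence

$$\vartheta_1'(z) = \sqrt{k}\left\{\vartheta_0'(z)\operatorname{sn}\frac{2Kz}{\pi} + \frac{2K}{\pi}\vartheta_0(z)\operatorname{cn}\frac{2Kz}{\pi}\operatorname{dn}\frac{2Kz}{\pi}\right\}, \tag{25}$$

and

$$\vartheta_1'''(0) = \sqrt{k}\left\{3\,\frac{2K}{\pi}\vartheta_0''(0) - \left(\frac{2K}{\pi}\right)^3\vartheta_0(0)(1 + k^2)\right\}, \qquad
\frac{\vartheta_1'''(0)}{\vartheta_1'(0)} = \frac{3\vartheta_0''(0)}{\vartheta_0(0)} - \left(\frac{2K}{\pi}\right)^2(1 + k^2). \tag{26}$$

But from (23)

$$\frac{\vartheta_2''(0)}{\vartheta_2(0)} = \frac{\vartheta_0''(0)}{\vartheta_0(0)} - \left(\frac{2K}{\pi}\right)^2, \qquad
\frac{\vartheta_3''(0)}{\vartheta_3(0)} = \frac{\vartheta_0''(0)}{\vartheta_0(0)} - k^2\left(\frac{2K}{\pi}\right)^2, \tag{27}$$

whence

$$\frac{\vartheta_1'''(0)}{\vartheta_1'(0)} = \frac{\vartheta_0''(0)}{\vartheta_0(0)} + \frac{\vartheta_2''(0)}{\vartheta_2(0)} + \frac{\vartheta_3''(0)}{\vartheta_3(0)}. \tag{28}$$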


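But by the partial differential equation (18) this is equivalent to

$$\frac{\partial}{\partial\tau}\log\vartheta_1'(0) = \frac{\partial}{\partial\tau}\log[\vartheta_0(0)\vartheta_2(0)\vartheta_3(0)],$$

and

$$\vartheta_1'(0) = c\,\vartheta_0(0)\vartheta_2(0)\vartheta_3(0),$$

where $c$ is independent of $\tau$ and $z$. Taking the lowest powers of $q$ in each series we find that

$$c = 1,$$

whence

$$\vartheta_1'(0) = \vartheta_0(0)\vartheta_2(0)\vartheta_3(0). \tag{29}$$

Now

$$\left[\frac{d}{dz}\operatorname{sn}\frac{2Kz}{\pi}\right]_{z=0} = \frac{2K}{\pi},$$

whence from (13), (14) and (15)

$$G = \frac{\prod(1-q^{2n})}{\prod(1+q^{2n})(1+q^{2n-1})(1-q^{2n-1})}. \tag{30}$$

The infinite product in the denominator is equal to 1; for

$$\prod(1+q^{2n})(1+q^{2n-1})(1-q^{2n-1}) = \prod(1+q^n)(1-q^{2n-1}) = \prod\frac{(1-q^{2n})(1-q^{2n-1})}{1-q^n} = 1. \tag{31}$$

Hence

$$G = \prod_{n=1}^{\infty}(1-q^{2n}). \tag{32}$$

Also from (22)

$$\frac{2K}{\pi} = \frac{1}{\sqrt k}\vartheta_2(0)\vartheta_3(0) = \vartheta_3^2(0), \tag{33}$$

and

$$iK' = \tau K = \tfrac12\pi\tau\,\vartheta_3^2(0). \tag{34}$$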
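The relations (29), (32) and (33) lend themselves to a direct numerical check from the series alone. The sketch below starts from an arbitrarily chosen nome $q$ and is a hedged illustration: the function names are hypothetical, and the comparison of $2K/\pi$ with $\vartheta_3^2(0)$ uses the arithmetic-geometric-mean evaluation of $K$, a standard result that is assumed here and does not appear in the text.

```python
import math

def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return a

def theta_constants(q, terms=20):
    """Return theta_0(0), theta_1'(0), theta_2(0), theta_3(0) from the series (3), (8)-(10)."""
    th0 = 1 + 2 * sum((-1) ** n * q ** (n * n) for n in range(1, terms))
    th3 = 1 + 2 * sum(q ** (n * n) for n in range(1, terms))
    th2 = 2 * sum(q ** ((n + 0.5) ** 2) for n in range(terms))
    # Termwise derivative of theta_1(z) = 2 sum (-1)^n q^((n+1/2)^2) sin (2n+1)z at z = 0.
    th1p = 2 * sum((-1) ** n * (2 * n + 1) * q ** ((n + 0.5) ** 2) for n in range(terms))
    return th0, th1p, th2, th3

def product(factor, terms=200):
    """Multiply factor(n) for n = 1, ..., terms."""
    result = 1.0
    for n in range(1, terms + 1):
        result *= factor(n)
    return result
```

From the constants one can then form $k' = \vartheta_0^2(0)/\vartheta_3^2(0)$ by (20) and compare $1/\mathrm{AGM}(1, k')$ with $\vartheta_3^2(0)$.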


The $\vartheta$ functions have one period equal to $\pi$ or $2\pi$, which is an advantage in problems
relating to fixed boundaries. The series are extremely rapidly convergent. If $K'/K = 1$,
$q = e^{-\pi}$, and $q^4$ is practically always negligible. (We can always arrange that $K'/K \ge 1$;
for we see from the differential equations that if we put $t = iu$ the differential equation
for $\operatorname{sc} t$ is converted into that for $\operatorname{sn} t$ with modulus $k'$. This is known as Jacobi’s imaginary
transformation.) On the other hand $k$ is often among the data, and both periods are to
be found. The sn, cn, dn notation is then more convenient.

The transformation for the $\vartheta$ functions is closely related to an identity that we have
had in relation to heat conduction. From 20·02 (7), (12)


x 

l + 


® 2 . nnx 

2 — sm —j—exp) 

% =s» 1 VbTT L 




for all positive t and 0 < x < l. By continuation it can be extended to all x. Differentiate 
with regard to x and multiply by l. Then 


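$$1 + 2\sum_{n=1}^{\infty}\cos\frac{n\pi x}{l}\exp\left(-\frac{n^2\pi^2h^2t}{l^2}\right) = \frac{l}{h\sqrt{(\pi t)}}\,e^{-x^2/4h^2t}\left(1 + 2\sum_{r=1}^{\infty}e^{-r^2l^2/h^2t}\cosh\frac{rlx}{h^2t}\right).$$

Put

$$\frac{\pi x}{l} = 2z, \qquad \sigma = \frac{\pi h^2t}{l^2};$$

then

$$1 + 2\sum_{n=1}^{\infty}\cos 2nz\exp(-\pi n^2\sigma) = \sigma^{-1/2}e^{-z^2/\pi\sigma}\left(1 + 2\sum_{r=1}^{\infty}e^{-\pi r^2/\sigma}\cosh\frac{2rz}{\sigma}\right).$$

Hence

$$\vartheta_3(z : \sigma) = \sigma^{-1/2}e^{-z^2/\pi\sigma}\,\vartheta_3\left(\frac{iz}{\sigma} : \frac{1}{\sigma}\right).$$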


The transformation is usually stated in terms of $\tau$, but is more convenient in terms of $\sigma$,
which is real in most applications. The notation adopted makes the use of $\sigma$ explicit.
If $\sigma$ is small the terms on the left diminish slowly at first, those on the right rapidly so
long as $\Re(z) < \pi$. The form of the expression on the right does not make the periodicity
in $\Re(z)$ explicit, but for larger values of $\Re(z)$ the values of the function can be inferred
from the periodicity.
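The two sides of the transformation, written with $q = e^{-\pi\sigma}$ as $1 + 2\sum\cos 2nz\,e^{-\pi n^2\sigma}$ and $\sigma^{-1/2}e^{-z^2/\pi\sigma}(1 + 2\sum e^{-\pi r^2/\sigma}\cosh 2rz/\sigma)$, can be compared numerically for real $z$. In this minimal sketch the function names are hypothetical, and the hyperbolic cosine is combined with the Gaussian factor into two exponentials, an implementation choice made only to avoid floating-point overflow.

```python
import math

def theta3_direct(z, sigma, terms=50):
    """1 + 2 sum cos(2nz) exp(-pi n^2 sigma): converges fast when sigma is large."""
    return 1 + 2 * sum(math.cos(2 * n * z) * math.exp(-math.pi * n * n * sigma)
                       for n in range(1, terms))

def theta3_transformed(z, sigma, terms=50):
    """Right-hand side of the transformation: converges fast when sigma is small."""
    total = 1.0
    for r in range(1, terms):
        # 2 exp(-pi r^2/sigma) cosh(2rz/sigma), written as two exponentials.
        total += math.exp((-math.pi * r * r + 2 * r * z) / sigma)
        total += math.exp((-math.pi * r * r - 2 * r * z) / sigma)
    return sigma ** -0.5 * math.exp(-z * z / (math.pi * sigma)) * total
```

For $\sigma = 0.1$ the direct series needs about a dozen terms for full accuracy, while the transformed one is dominated by its first term or two, which is the practical point of the transformation.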
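By using the relations (8), (9), (10), we derive the corresponding transformations of
the other $\vartheta$ functions.

$$\vartheta_0(z : \sigma) = \sigma^{-1/2}e^{-z^2/\pi\sigma}\,\vartheta_2\left(\frac{iz}{\sigma} : \frac{1}{\sigma}\right),$$

$$\vartheta_2(z : \sigma) = \sigma^{-1/2}e^{-z^2/\pi\sigma}\,\vartheta_0\left(\frac{iz}{\sigma} : \frac{1}{\sigma}\right),$$

$$\vartheta_1(z : \sigma) = -i\sigma^{-1/2}e^{-z^2/\pi\sigma}\,\vartheta_1\left(\frac{iz}{\sigma} : \frac{1}{\sigma}\right).$$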

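25·112. Change of argument. When the argument of a $\vartheta$ function is increased by
$\tfrac12\pi$, $\tfrac12\pi\tau$, or $\tfrac12\pi + \tfrac12\pi\tau$ the result is another $\vartheta$ function multiplied by a simple factor; the
following table gives the results, several of which have already been used. We write

$$M = q^{-1/4}e^{-iz}, \qquad M' = q^{-1/2}M^2 = q^{-1}e^{-2iz}.$$

$$\begin{array}{ccccccc}
z & z + \tfrac12\pi & z + \tfrac12\pi\tau & z + \tfrac12\pi + \tfrac12\pi\tau & z + \pi & z + \pi\tau & z + \pi + \pi\tau \\ \hline
\vartheta_0 & \vartheta_3 & iM\vartheta_1 & M\vartheta_2 & \vartheta_0 & -M'\vartheta_0 & -M'\vartheta_0 \\
\vartheta_1 & \vartheta_2 & iM\vartheta_0 & M\vartheta_3 & -\vartheta_1 & -M'\vartheta_1 & M'\vartheta_1 \\
\vartheta_2 & -\vartheta_1 & M\vartheta_3 & -iM\vartheta_0 & -\vartheta_2 & M'\vartheta_2 & -M'\vartheta_2 \\
\vartheta_3 & \vartheta_0 & M\vartheta_2 & iM\vartheta_1 & \vartheta_3 & M'\vartheta_3 & M'\vartheta_3
\end{array}$$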

25·113. Expression of elliptic function with given periods in terms of $\vartheta$ functions.
Since

$$\vartheta_1(z + \pi) = -\vartheta_1(z), \qquad \vartheta_1(z + \pi\tau) = -q^{-1}e^{-2iz}\vartheta_1(z),$$

$$\frac{\vartheta_1'(z + \pi)}{\vartheta_1(z + \pi)} = \frac{\vartheta_1'(z)}{\vartheta_1(z)}, \qquad \frac{\vartheta_1'(z + \pi\tau)}{\vartheta_1(z + \pi\tau)} = -2i + \frac{\vartheta_1'(z)}{\vartheta_1(z)}.$$

Hence the function $\vartheta_1'(z)/\vartheta_1(z)$ has period $\pi$. It has simple poles of residue 1 at $z = 0$, $\pi$, $\pi\tau$,
$\pi + \pi\tau$, ... and its derivative has also the period $\pi\tau$.

This property can be used to express any elliptic function $\phi(u)$ of periods $\omega$ and $\omega'$
in terms of $\vartheta_1$. Put

$$\phi(u) = f\left(\frac{\pi u}{\omega}\right) = f(z), \qquad \omega'/\omega = \tau,$$

where $\omega'$, $\omega$ are taken so that $\Im(\omega'/\omega) > 0$.

Then $f(z)$ has the periods $\pi$, $\pi\tau$. Let the poles of $f(z)$ in a fundamental parallelogram be
simple poles at $\alpha_1, \alpha_2, \dots$ of residues $A_1, A_2, \dots$. Then

$$F(z) = \sum A_i\frac{\vartheta_1'(z - \alpha_i)}{\vartheta_1(z - \alpha_i)}$$

differs from $f(z)$ by at most an integral function. Further,

$$F(z + \pi) = F(z),$$

$$F(z + \pi\tau) = \sum A_i\frac{\vartheta_1'(z - \alpha_i)}{\vartheta_1(z - \alpha_i)} - 2i\sum A_i = F(z),$$

since $\sum A_i$ is the sum of the residues of $f(z)$ in a fundamental parallelogram and therefore
is zero. Hence

$$f(z) - F(z) = \text{constant}.$$





It follows that

$$\int f(z)\,dz = \sum A_i\log\vartheta_1(z - \alpha_i) + Cz,$$

so that any elliptic function with only simple poles can be integrated in terms of $\vartheta_1$.
This can be extended at once to multiple poles by making use of the derivatives of $\vartheta_1'/\vartheta_1$.
Again, if $f(z)$ has $n$ poles $\alpha_i$, and $n$ zeros $\beta_i$ we can take

$$G(z) = \frac{\prod\vartheta_1(z - \beta_i)}{\prod\vartheta_1(z - \alpha_i)},$$

multiple poles and zeros being repeated in the products. Then

$$G(z + \pi) = G(z),$$

since the numbers of poles and zeros are equal; and

$$G(z + \pi\tau) = \frac{\exp(2i\sum\beta_i)}{\exp(2i\sum\alpha_i)}\,G(z).$$

But $\sum\beta_i - \sum\alpha_i$ is of the form $p\pi + q\pi\tau$, where $p$ and $q$ are integers; and this can be made
zero by a suitable choice of the $\beta_i$ and $\alpha_i$, if necessary going outside the original parallelogram.
Then $f(z)/G(z)$ is a constant.

25·12. Reduction of elliptic integrals to the standard form. Let $R(x)$ be a
quartic with real coefficients and no repeated factor. If we make a bilinear transformation

$$x = \frac{\alpha t + \beta}{\gamma t + \delta}, \tag{1}$$

$R(x)$ takes the form $R_1(t)/(\gamma t + \delta)^4$, $R_1(t)$ being another quartic, and $dx/dt$ is proportional
to $(\gamma t + \delta)^{-2}$. Hence $\int dx/\sqrt{R(x)}$ is transformed into the form $\int dt/\sqrt{R_1(t)}$. We want to choose
the transformation so that $R_1(t)$ will be an even function. Let the roots of $R(x)$ be $a$, $b$, $c$, $d$,
those of $R_1(t)$ be $-h$, $-g$, $g$, $h$. Try

$$\frac{x - b}{x - c} = l\,\frac{t + g}{t - g}. \tag{2}$$

Then $b$ corresponds to $-g$ and $c$ to $+g$. For the other roots to correspond we must have

$$\frac{a - b}{a - c} = l\,\frac{-h + g}{-h - g} = l\,\frac{h - g}{g + h}, \qquad \frac{d - b}{d - c} = l\,\frac{h + g}{h - g}, \tag{3}$$

whence

$$\left(\frac{g + h}{g - h}\right)^2 = \frac{d - b}{d - c}\,\frac{a - c}{a - b}, \tag{4}$$

the cross ratio of the roots $d$, $a$ with respect to $b$, $c$; also

$$l = \frac{a - b}{a - c}\,\frac{g + h}{h - g}. \tag{5}$$

Then if $g$ is taken arbitrarily (4) determines $h$ and then (5) determines $l$.

Case 1. Let all the roots be real and $a < b < c < d$. Then if $g$ is real so is $h$. Take $g = 1$.
Then taking the positive root

$$\frac{h + 1}{h - 1} > 1, \qquad h > 1, \tag{6}$$

$$R_1(t) = C(t^2 - 1)(t^2 - h^2) = C'(1 - t^2)(1 - k^2t^2),$$

and the transformation leads to $t = \operatorname{sn} u$ with $k = 1/h < 1$. $\qquad (7)$
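Case 1 can be followed through numerically. The sketch below is an illustration under the assumptions of that case (real roots $a < b < c < d$, $g = 1$): it computes $h$ from (4) and (6), $l$ from (5) and the modulus $k = 1/h$, and inverts (2) to recover $x$ from $t$. The function names and the sample roots are hypothetical.

```python
import math

def even_quartic_parameters(a, b, c, d):
    """Case 1 (real roots a < b < c < d, g = 1): return (h, l, k)."""
    cross = (d - b) * (a - c) / ((d - c) * (a - b))   # right side of (4), > 1 here
    rho = math.sqrt(cross)                              # positive root, equal to (h + 1)/(h - 1)
    h = (rho + 1) / (rho - 1)
    l = (a - b) / (a - c) * (h + 1) / (h - 1)           # equation (5) with g = 1
    return h, l, 1 / h

def x_of_t(t, b, c, l):
    """Invert (x - b)/(x - c) = l (t + 1)/(t - 1) for x (t must not equal 1)."""
    r = l * (t + 1) / (t - 1)
    return (b - c * r) / (1 - r)
```

With the assumed roots $0, 1, 2, 4$ the cross ratio is 3, so $h = 2 + \sqrt3$ and $k = 2 - \sqrt3$, and the points $t = -h, -1, h$ map back to $x = a, b, d$; the remaining root $c$ corresponds to $t = 1$, where (2) has its pole.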
Case 2. Let $b$ and $c$ be real, $a$ and $d$ conjugate complexes. Take $g = 1$. Then

$$\left|\frac{g + h}{g - h}\right| = 1, \tag{8}$$

and $h$ is purely imaginary $= i/j$. Then

$$R_1(t) = C(1 - t^2)(1 + j^2t^2), \tag{9}$$

and integration can be carried out by putting $t = \operatorname{cn} u$, $\operatorname{sd} u$, or $\operatorname{ds} u$.

Case 3. Let $a$ and $d$ be conjugate complexes, $b$ and $c$ also. Take $g = i$. $h$ will be purely
imaginary since (4) is again real; and

$$R_1(t) = C(1 + t^2)(1 + j^2t^2). \tag{10}$$

Integration can be carried out by putting $t = \operatorname{sc} u$ or $\operatorname{cs} u$.

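Integrals of the form $\int dt/\sqrt{R_1(t)}$ are called elliptic integrals of the first kind, and can be
evaluated in terms of the functions sn, cn, dn. More complicated integrals involving the
square root of a quartic can be reduced to the form

$$\int\frac{t f_1(t^2) + f_2(t^2)}{\sqrt{R_1(t)}}\,dt, \tag{11}$$

where $f_1$ and $f_2$ are rational functions. Then this can be broken up by partial fractions
into terms of the types

$$\int\frac{t f_1(t^2)\,dt}{\sqrt{R_1(t)}}, \qquad \int\frac{t^{2p}\,dt}{\sqrt{R_1(t)}}, \qquad \int\frac{dt}{(1 + nt^2)\sqrt{R_1(t)}}. \tag{12}$$

The first form gives an elementary function. For definiteness we take the case when
$R_1(t) = (1 - t^2)(1 - k^2t^2)$. In the second, if we put $t = \operatorname{sn} u$, we have

$$u_p = \int\operatorname{sn}^{2p} u\,du.$$

But

$$\frac{d}{du}(\operatorname{sn}^{2p-3} u\operatorname{cn} u\operatorname{dn} u) = (2p - 3)\operatorname{sn}^{2p-4} u\operatorname{cn}^2 u\operatorname{dn}^2 u - \operatorname{sn}^{2p-2} u\operatorname{dn}^2 u - k^2\operatorname{sn}^{2p-2} u\operatorname{cn}^2 u
= \alpha\operatorname{sn}^{2p} u + \beta\operatorname{sn}^{2p-2} u + \gamma\operatorname{sn}^{2p-4} u,$$

where $\alpha$, $\beta$, $\gamma$ are constants. Hence by successive reduction $u_p$ can be reduced to $u$ and
$\int\operatorname{sn}^2 u\,du$. The usual standard form is

$$E(u) = \int_0^u\operatorname{dn}^2 u\,du = \int_0^x\sqrt{\left(\frac{1 - k^2x^2}{1 - x^2}\right)}\,dx, \tag{13}$$

and is called the second elliptic integral.

The second complete elliptic integral is

$$E = E(K) = \int_0^K\operatorname{dn}^2 u\,du. \tag{14}$$

$E(u)$ is not periodic; but the function

$$Z(u) = E(u) - \frac{E}{K}u \tag{15}$$

has period $2K$. It is called the elliptic Zeta function, not to be confused with the Riemann
$\zeta$ function

$$\zeta(s) = \sum_{n=1}^{\infty}n^{-s}.$$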



The third form in (12) is called the third elliptic integral. It can be evaluated in terms of $\vartheta$
functions at the cost of a good deal of algebra; direct recourse to arithmetic is probably
usually the best method.

Integration of elliptic functions can always be carried out in terms of $\vartheta$ functions.
For the necessary transformations the special treatises should be consulted.


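25·13. Complete elliptic integrals. The complete elliptic integrals often arise by
themselves as definite integrals. They are fully tabulated, but care should be taken in
using the tables; $k$ is sometimes denoted by $\sin\alpha$, sometimes by $\sin\tfrac12\alpha$, the latter being a
survival from the time when the pendulum was the only application of $K$. $x$ is often
denoted by $\sin\phi$, $\sqrt{(1-k^2x^2)}$ by $\Delta\phi$. Approximations to $K'$ and $E'$ when $k$ is small are of
interest. We have

$$K' = \int_0^{\frac12\pi} \frac{d\phi}{\Delta'(\phi)} = \int_0^{\frac12\pi} \frac{1 - k'\sin\phi}{\Delta'(\phi)}\,d\phi + \int_0^{\frac12\pi} \frac{k'\sin\phi}{\Delta'(\phi)}\,d\phi = I_1 + I_2, \text{ say},$$

where $\Delta' = (1 - k'^2\sin^2\phi)^{1/2}$. As $k \to 0$,

$$I_1 \to \int_0^{\frac12\pi} \frac{1-\sin\phi}{\cos\phi}\,d\phi = \log 2.$$

$I_2$ is integrable exactly;

$$I_2 = -\int_{\phi=0}^{\frac12\pi} \frac{k'\,d(\cos\phi)}{\sqrt{(k^2 + k'^2\cos^2\phi)}} = -\Big[\log\{k'\cos\phi + \sqrt{(k^2 + k'^2\cos^2\phi)}\}\Big]_0^{\frac12\pi} = \log\frac{1+k'}{k} = \log\frac{2}{k} + O(k),$$

and

$$K' = \log\frac{4}{k} + O(k).$$

Similarly if $k$ is small

$$E' = \int_0^{\frac12\pi} \Delta'(\phi)\,d\phi \approx \int_0^{\frac12\pi} \cos\phi\,d\phi = 1,$$

while

$$E = \int_0^{\frac12\pi} \Delta(\phi)\,d\phi = \tfrac12\pi + O(k^2).$$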
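
As a numerical check of these approximations, the following sketch evaluates $K'$ and $E'$ by the trapezoidal rule and compares them with $\log(4/k)$ and 1. The function names `trapezoid`, `K_prime` and `E_prime`, and the number of steps, are illustrative choices and not part of the text; the integrands use $\Delta'^2 = \cos^2\phi + k^2\sin^2\phi$, which follows from $k^2 + k'^2 = 1$.

```python
import math

def trapezoid(g, a, b, n):
    """Composite trapezoidal rule for g on [a, b] with n equal steps."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return h * total

def K_prime(k, n=20000):
    """K' = integral over (0, pi/2) of dphi / Delta'(phi), with
    Delta'(phi)^2 = 1 - k'^2 sin^2(phi) = cos^2(phi) + k^2 sin^2(phi)."""
    g = lambda p: 1.0 / math.sqrt(math.cos(p) ** 2 + (k * math.sin(p)) ** 2)
    return trapezoid(g, 0.0, 0.5 * math.pi, n)

def E_prime(k, n=20000):
    """E' = integral over (0, pi/2) of Delta'(phi) dphi."""
    g = lambda p: math.sqrt(math.cos(p) ** 2 + (k * math.sin(p)) ** 2)
    return trapezoid(g, 0.0, 0.5 * math.pi, n)

if __name__ == "__main__":
    # Columns: k, K' by quadrature, log(4/k), E' by quadrature.
    for k in (0.1, 0.01):
        print(k, K_prime(k), math.log(4.0 / k), E_prime(k))
```

For small $k$ the printed values of $K'$ and $\log(4/k)$ should agree closely, and $E'$ should be close to 1, in line with the estimates above.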

25·14. Reduction of integrals containing a cubic. If

$$t = \int^x \frac{dx}{\sqrt{X}}, \qquad X = (x-a)(b-x)(c-x),$$

we can put $x - a = \xi^2$, and then

$$t = 2\int \frac{d\xi}{\{(b-a-\xi^2)(c-a-\xi^2)\}^{1/2}},$$

which is in one or other of the standard forms according to the signs of $b - a$ and $c - a$.
These forms are more generally useful than the Weierstrass one, which takes the standard
function as

$$\wp(u) = \frac{1}{u^2} + \sum{}' \left\{ \frac{1}{(u - m\omega - n\omega')^2} - \frac{1}{(m\omega + n\omega')^2} \right\}.$$

This satisfies

$$\wp'^2(u) = 4\wp^3(u) - g_2\,\wp(u) - g_3,$$

where $g_2$ and $g_3$ are known functions of $\omega$, $\omega'$. This function has the property that for $u$ small

$$\wp(u) - \frac{1}{u^2} = O(u^2).$$

Had Abel been alive he would probably have remarked that this property corresponds to
taking the fundamental trigonometric function as $\operatorname{cosec}^2 x - \tfrac13$.

The square of any of sn, cn, dn and their reciprocals and ratios is a function with only
double poles, and having the periods $2K$ and $2iK'$. These functions can therefore be used
in the same way as $\wp(u)$ in Weierstrass's method.

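25·15. Change of modulus: Landen's transformation. This important transformation
is most symmetrically expressed if we take the integral

$$\int_0^{\phi} \frac{d\phi}{\{a^2\cos^2\phi + b^2\sin^2\phi\}^{1/2}} = \int_0^{u} \frac{du}{\{(1+u^2)(a^2+b^2u^2)\}^{1/2}} \quad (u = \tan\phi), \qquad (1)$$

where $a \ge b > 0$. We put

$$u = \frac{v}{A - Bv^2}; \qquad (2)$$

then the integral becomes

$$\int_0^{v} \frac{(A + Bv^2)\,dv}{[\{(A-Bv^2)^2 + v^2\}\{a^2(A-Bv^2)^2 + b^2v^2\}]^{1/2}}. \qquad (3)$$

With a suitable choice of $A$ and $B$ we can arrange that the second square root is proportional
to the numerator; this is seen to impose only one condition, and we can add the
further condition that the first factor in the denominator is to vanish when $v^2 = -1$.
The result will be a new integral of the same form as that in $u$. The conditions are

$$(A+B)^2 = 1, \qquad 4AB = b^2/a^2, \qquad (4)$$

and $A$ and $B$ can be taken as $\tfrac12 \pm \tfrac12\sqrt{(1 - b^2/a^2)}$. Then the integral is

$$\int_0^{v} \frac{dv}{\{(1+v^2)(a^2A^2 + a^2B^2v^2)\}^{1/2}}, \qquad (5)$$

and

$$a^2 = (A+B)^2a^2 = (\alpha+\beta)^2, \qquad b^2 = 4ABa^2 = 4\alpha\beta, \qquad (6)$$

where $\alpha = aA$, $\beta = aB$; thus

$$\int_0^{v} \frac{dv}{\{(1+v^2)(\alpha^2 + \beta^2v^2)\}^{1/2}} = \frac12\int_0^{u} \frac{du}{[(1+u^2)\{\tfrac14(\alpha+\beta)^2 + \alpha\beta u^2\}]^{1/2}}, \qquad (7)$$

with

$$v = \frac{\{(\alpha+\beta)^2 + 4\alpha\beta u^2\}^{1/2} - (\alpha+\beta)}{2\beta u},$$

or

$$\int_0^{\psi} \frac{d\psi}{\{\alpha^2\cos^2\psi + \beta^2\sin^2\psi\}^{1/2}} = \frac12\int_0^{\phi} \frac{d\phi}{\{\tfrac14(\alpha+\beta)^2\cos^2\phi + \alpha\beta\sin^2\phi\}^{1/2}}, \qquad (8)$$

with

$$\tan\phi = \frac{(\alpha+\beta)\tan\psi}{\alpha - \beta\tan^2\psi}. \qquad (9)$$

If $\psi = \tfrac12\pi$, $\phi = \pi$, and we have a simple relation between the complete elliptic integrals.
If we denote the integrals by $t$ we have, writing the modulus as the second argument of sn,

$$\sin\psi = \operatorname{sn}\left\{\alpha t, \left(1 - \frac{\beta^2}{\alpha^2}\right)^{1/2}\right\}, \qquad \sin\phi = \operatorname{sn}\left\{(\alpha+\beta)t, \frac{\alpha-\beta}{\alpha+\beta}\right\}, \qquad (10)$$

with (9) holding between the corresponding sc functions.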

In (8) the constants $\alpha$, $\beta$ are replaced by their arithmetic and geometric means. Thus
if $\alpha = 0·9$, $\beta = 0·1$, the arithmetic and geometric means are 0·5 and 0·3; repeating the
transformation we get 0·40 and 0·387. By successive applications of the transformation
we can therefore reduce the elliptic integral as closely as we like to a linear function of
the new argument.

The complete integrals of the form (8) are symmetrical in $\alpha$ and $\beta$, but

$$\int_0^{\frac12\pi} \frac{d\psi}{(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^{1/2}} = \int_0^{\frac12\pi} \frac{d\psi}{\{\alpha^2 - (\alpha^2-\beta^2)\sin^2\psi\}^{1/2}} = \frac{1}{\alpha}K\left\{\left(1 - \frac{\beta^2}{\alpha^2}\right)^{1/2}\right\}, \qquad (11)$$

in which the symmetry is no longer obvious. But it is also equal to

$$\frac{2}{\alpha+\beta}K\left(\frac{\alpha-\beta}{\alpha+\beta}\right), \qquad (12)$$

in which the symmetry is obvious since $K$ is an even function of $k$.
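
The repeated replacement of $\alpha$, $\beta$ by their arithmetic and geometric means can be sketched as follows. For the complete integral the two means converge to a common limit $m$, and the integral then reduces to $\tfrac12\pi/m$; this limiting statement is a standard consequence of (8) rather than something stated in the text, and the names `agm_steps`, `complete_integral` and `trapezoid_integral` are illustrative. The trapezoidal quadrature is included only as an independent check.

```python
import math

def agm_steps(alpha, beta, tol=1e-15):
    """Repeatedly replace (alpha, beta) by their arithmetic and geometric means.

    Returns the list of successive pairs, starting with the input pair."""
    pairs = [(alpha, beta)]
    while abs(alpha - beta) > tol * alpha:
        alpha, beta = 0.5 * (alpha + beta), math.sqrt(alpha * beta)
        pairs.append((alpha, beta))
    return pairs

def complete_integral(alpha, beta):
    """Integral over (0, pi/2) of dpsi / sqrt(alpha^2 cos^2 psi + beta^2 sin^2 psi),
    taken as pi / (2 m), where m is the common limit of the two means."""
    m = agm_steps(alpha, beta)[-1][0]
    return 0.5 * math.pi / m

def trapezoid_integral(alpha, beta, n=4000):
    """The same integral by the composite trapezoidal rule (independent check)."""
    g = lambda p: 1.0 / math.sqrt((alpha * math.cos(p)) ** 2 + (beta * math.sin(p)) ** 2)
    h = 0.5 * math.pi / n
    total = 0.5 * (g(0.0) + g(0.5 * math.pi))
    for i in range(1, n):
        total += g(i * h)
    return h * total

if __name__ == "__main__":
    for pair in agm_steps(0.9, 0.1)[:4]:
        print(pair)                      # first pairs: (0.9, 0.1), (0.5, 0.3), (0.4, 0.387...)
    print(complete_integral(0.9, 0.1), trapezoid_integral(0.9, 0.1))
```

Interchanging the two arguments gives the same sequence of means after the first step, which is the symmetry noted for (12).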
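The second elliptic integral can be treated similarly. We take

$$J = \int_0^{\phi} \{\tfrac14(\alpha+\beta)^2\cos^2\phi + \alpha\beta\sin^2\phi\}^{1/2}\,d\phi = \tfrac12(\alpha+\beta)^2\int_0^{\psi} \frac{(\alpha\cos^2\psi + \beta\sin^2\psi)^2}{(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^{3/2}}\,d\psi, \qquad (13)$$

by the same transformation. We have

$$\frac{d}{d\psi}\,\frac{\cos\psi\sin\psi}{(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^{1/2}} = \frac{\alpha^2\cos^4\psi - \beta^2\sin^4\psi}{(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^{3/2}}$$

and

$$(\alpha\cos^2\psi + \beta\sin^2\psi)^2 = -\frac{\alpha-\beta}{\alpha+\beta}(\alpha^2\cos^4\psi - \beta^2\sin^4\psi) + \frac{2\alpha\beta}{(\alpha+\beta)^2}(\alpha^2\cos^2\psi + \beta^2\sin^2\psi) + \frac{2}{(\alpha+\beta)^2}(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^2.$$

The easiest way to find the coefficients is to insert a factor $\cos^2\psi + \sin^2\psi$ in the second
term on the right and then equate coefficients. Then

$$J = -\tfrac12(\alpha^2-\beta^2)\frac{\cos\psi\sin\psi}{(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^{1/2}} + \alpha\beta\int_0^{\psi}\frac{d\psi}{(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^{1/2}} + \int_0^{\psi}(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^{1/2}\,d\psi, \qquad (14)$$

which can be written

$$\int_0^{\psi}(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^{1/2}\,d\psi = \int_0^{\phi}\{\tfrac14(\alpha+\beta)^2\cos^2\phi + \alpha\beta\sin^2\phi\}^{1/2}\,d\phi$$

$$\qquad - \tfrac12\alpha\beta\int_0^{\phi}\frac{d\phi}{\{\tfrac14(\alpha+\beta)^2\cos^2\phi + \alpha\beta\sin^2\phi\}^{1/2}} + \tfrac12(\alpha^2-\beta^2)\frac{\cos\psi\sin\psi}{(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^{1/2}}. \qquad (15)$$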

Equivalent relations were found by Landen in 1775 in determining the length of an arc of 
a hyperbola. 

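For the complete integrals we have

$$\int_0^{\frac12\pi}(\alpha^2\cos^2\psi + \beta^2\sin^2\psi)^{1/2}\,d\psi = 2\int_0^{\frac12\pi}\{\tfrac14(\alpha+\beta)^2\cos^2\phi + \alpha\beta\sin^2\phi\}^{1/2}\,d\phi - \alpha\beta\int_0^{\frac12\pi}\frac{d\phi}{\{\tfrac14(\alpha+\beta)^2\cos^2\phi + \alpha\beta\sin^2\phi\}^{1/2}}. \qquad (16)$$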

These relations were used by Legendre in his numerical calculation of the elliptic integrals. 


EXAMPLES 


1. Prove that

$$\operatorname{sn}\tfrac12K = \frac{1}{k\sqrt2}\{\sqrt{(1+k)} - \sqrt{(1-k)}\},$$

and find what values of $z$ make $\operatorname{sn} z$ equal to the four expressions obtained by reversing the signs of
the roots. (Use $\operatorname{cn} K = 0$.)

2. Prove that $\operatorname{sn}\tfrac12 iK' = i/\sqrt{k}$,

and find what values of $z$ make sn z =

3. If

$$u = \int \frac{dx}{\sqrt{(x^4 - 7x^2 + 10)}},$$

express $x$ as a single-valued function of $u$.

4. Solve the pendulum equation in terms of elliptic functions.
5. Prove that


ksnt = lim 

lim 

S 

m-»- oo »-> oo 

/* — 



m 

fccn< = lim 

lim 

E 

m->oo n->oo 

/*—' 



m 

dn$ = lim 

lim 

s 


s= -f ^ (sin 2 £0 O - sin 2 $0) 
l 




1 _ ) 


l !: 


2 K' 


2 K' 


i.{< 


2 K' 


2 K' 


I 


p = - mv ~- n \{t + ^K + ±viKT + K'^ {t+{±ii + 2)K + 4»iK')* + K>*\ 

and express the double series as series of trigonometric functions of 2Kt/n and of hyperbolic functions 
of 2 K't/n. 

6. Prove that k J* snudu = log^ 1 |^^(w —^')| +l°g^ |( M + j 

dnu- kcnu 

= 41 og-constant. 

dnw+Kcnti 



NOTES 


1·116a. Theorem of bounded convergence. We need first a few definitions. For any
finite set $I$ of non-overlapping intervals we define $lI$ as the sum of their lengths. For an
infinite set $I$ within $(a, b)$ we define $lI$ as the upper bound of the sums of lengths of finite
sets included in $I$. If $E$ is a set of points of $(a, b)$ we define the complementary set $CE$ as
the set of points of $(a, b)$ that are not members of $E$. The complementary set of a finite set
of closed non-overlapping intervals is a finite set of non-overlapping intervals. If all
points of $E_1$ are members of $E_2$ we write $E_1 \subset E_2$ (read, $E_1$ is included in $E_2$).

Lemma 1. If $I$ is a set of non-overlapping intervals included in $(a, b)$, and $\delta$ is any positive
quantity, however small, there exists a finite set $J$ of the intervals of $I$ such that $lI \ge lJ > lI - \delta$.
This follows at once from the fact that $lI$ is the upper bound of $lJ$ for all finite sets.

Lemma 2. In the same conditions there exists a finite set of closed intervals $K$ included in
$I$ such that $lK > lI - 2\delta$. For let the $J$ of Lemma 1 be $m$ in number. For any interval of $J$,
say $(\alpha_i, \beta_i)$, such that $\beta_i - \alpha_i > \delta/m$, define an interval of $K$ as the closed interval
$(\alpha_i + \delta/2m, \beta_i - \delta/2m)$. Then the set of intervals $K$ have the property required.

$I$ consists of $K$ together with another set of non-overlapping intervals $K'$ such that

$$lI = lK + lK'.$$

Lemma 3. If $\{I_n\}$ is a sequence of sets of non-overlapping intervals included in $(a, b)$ such
that $I_n \subset I_{n-1}$, and such that no $x$ of $(a, b)$ belongs to all $I_n$, then $lI_n \to 0$. Given $\delta > 0$, we can
take a finite closed set $K_n$ included in $I_n$ such that $lK_n > lI_n - 2^{-n}\delta$. Denote the set of points
contained in $I_n$ and not in $K_n$ by $K'_n$. Take $L_n$ to be the set of points common to $K_1, K_2, \ldots,
K_n$. Then a point of $I_n$ is a point of $I_1, \ldots, I_{n-1}$ and therefore is either a point of $L_n$ or of
at least one of $K'_1, \ldots, K'_n$; and $lI_n \le lL_n + lK'_1 + \ldots + lK'_n < lL_n + \delta$. Hence $L_n$ is a finite
set of non-overlapping closed intervals (and therefore closed) satisfying $L_n \subset L_{n-1}$ and
$lL_n > lI_n - \delta$.

Since no point of $(a, b)$ belongs to all $I_n$, every point belongs to some $CI_n$ and therefore
to some $CL_n$, since $CI_n \subset CL_n$. It is not an end-point of any interval of this $CL_n$ because
end-points of $CL_n$ are end-points of $L_n$, which belong to $L_n$ since $L_n$ is closed. Hence for
every point $x$ of $(a, b)$ there is an $n$ such that $x$ is interior to some interval of $CL_n$. Consider
the set of all such intervals. Then, by the Heine-Borel theorem, there is a finite set of
intervals $d_1, d_2, \ldots, d_k$, each part of some $CL_n$, that covers $(a, b)$. Let $d_r$ be a member of $CL_{n_r}$
and let $n_0$ be the greatest of $n_1, \ldots, n_k$. Then since $CL_n \subset CL_{n+1}$, $CL_{n_0}$ includes all $d_r$ and
therefore $CL_{n_0}$ is the whole interval $(a, b)$, and $L_n$ is empty for all $n \ge n_0$. Hence $lL_n = 0$,
$lI_n < \delta$ for all $n \ge n_0$. Since $\delta$ is arbitrary, $lI_n \to 0$.

Lemma 4. If $f_n(x)$ is non-negative and integrable in $(a, b)$, $f_n(x) < M$ for all $x$ and $n$, and
$f_n(x) \to 0$ for all $x$, then $\int_a^b f_n(x)\,dx \to 0$. For every $n$ define a subdivision such that the
corresponding lower sum as in 1·101 differs from $\int_a^b f_n(x)\,dx$ by less than $1/n$. Take $g_n(x) = 0$
at every point of subdivision, and inside any subinterval take $g_n(x)$ equal to the lower
bound of $f_n(x)$ at points of that subinterval. Then

$$f_n(x) \ge g_n(x) \ge 0; \qquad 0 \le \int_a^b \{f_n(x) - g_n(x)\}\,dx < 1/n; \qquad g_n(x) \to 0.$$

Given $\epsilon > 0$, denote by $I_n$ the set of all $x$ where $g_p(x) > \epsilon$ for at least one $p \ge n$. Interior
to any interval chosen for $f_p(x)$, $g_p(x)$ is constant; hence $g_p(x) > \epsilon$ throughout a subinterval
if it is $> \epsilon$ at any point. Thus $I_n$ is a set of non-overlapping intervals, and $I_n \subset I_{n-1}$. No $x$
belongs to all $I_n$, for if it did we should have $f_p(x) \ge g_p(x) > \epsilon$ for an infinite sequence of
values of $p$, and $f_p(x)$ would not tend to 0. Hence $I_n$ satisfies the conditions of Lemma 3,
and $lI_n \to 0$. Take $n_0$ such that for $n \ge n_0$, $lI_n < \delta$. Then for $n \ge n_0$

$$\int_a^b g_n(x)\,dx \le \epsilon(b - a - \delta) + M\delta,$$

$$\int_a^b f_n(x)\,dx \le \epsilon(b - a - \delta) + M\delta + 1/n.$$

But $\epsilon$, $\delta$ and $1/n$ can all be taken arbitrarily small. Hence the lemma is proved.

Theorem. If $f_n(x)$ is integrable in $(a, b)$, $|f_n(x)| < M$, and $f_n(x) \to f(x)$, where $f(x)$ is integrable,
then $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$. Take $h_n(x) = |f_n(x) - f(x)|$. Then $h_n(x)$ satisfies the
conditions of Lemma 4 and therefore $\int_a^b h_n(x)\,dx \to 0$. But

$$\left|\int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx\right| \le \int_a^b |f_n(x) - f(x)|\,dx,$$

whence $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$.

It will be noticed that $\int_a^x f_n(u)\,du \to \int_a^x f(u)\,du$ uniformly in $(a, b)$.

The above proof is one of three (somewhat overlapping) given to us by Professor
Besicovitch. Dr Smithies has given an independent proof.

As an illustration, take in (0, 1) $f_n(x) = 0$ for all irrational $x$. If $x$ is rational and equal
to $m/k$ in its lowest terms, take $f_n(x) = 1$ for $n = k$ and otherwise $= 0$. Then $f_n(x) \to 0$
everywhere, and

$$\lim \int_0^1 f_n(x)\,dx = \int_0^1 \lim f_n(x)\,dx.$$

But in any interval of $x$, for any $n$, there are rationals whose denominators exceed $n$, and
therefore $f_n(x)$ does not tend to 0 uniformly in any interval of $x$. The reason for using the
lower sum, instead of, as usual, the upper sum, is that the upper sum leads to no definition
of a set of intervals $I_n$ with the properties required.

1·134a. The most important applications are to integrals of the forms

$$\int_a^b \{e^{-tx}, \cos tx, \sin tx\}\,v(x)\,dx \qquad (b > a),$$

when $t$ is positive and large. Note first that if $v(x)$ is given only to be bounded, say
$|v(x)| < A$, we can assert (without using Abel's lemma) that

$$\left|\int_a^b e^{-tx}v(x)\,dx\right| < Ae^{-ta}/t. \qquad (1)$$

Without further restrictions we cannot say that $\int_a^b \{\cos tx, \sin tx\}\,v(x)\,dx$ are $O(1/t)$. But
with a further condition on $v(x)$ we can prove the latter statements, and we can find a
result to replace (1) in the case where the integral is improper.

(a) If $t$ is positive, and $\left|\int_a^x f(x)\,dx\right| < M$ in $a \le x \le b$, then $e^{-tx}$ satisfies the conditions
imposed on $v(x)$ in Abel's lemma, and we have

$$\left|\int_a^b e^{-tx}f(x)\,dx\right| < Me^{-ta}. \qquad (2)$$

(b) If $t$ is positive, and $v(x)$ is non-negative, bounded, and non-increasing in $a \le x \le b$,
then

$$\left|\int_a^x \cos tx\,dx\right| \le \frac{2}{t}, \qquad \left|\int_a^x \sin tx\,dx\right| \le \frac{2}{t},$$

whence, by taking $f(x) = \cos tx$ or $\sin tx$ in Abel's lemma, we have

$$\left|\int_a^b \cos tx\,v(x)\,dx\right| \le \frac{2}{t}v(a), \qquad \left|\int_a^b \sin tx\,v(x)\,dx\right| \le \frac{2}{t}v(a). \qquad (3)$$

(c) If $t$ is positive, and $v(x)$ has total variation $V$ in $a \le x \le b$, let its positive and negative
variations in $(a, x)$ be $P(x)$ and $N(x)$. Then

$$v(x) = v(a) + P(x) - N(x)$$

$$= v(a) + \{N(b) - N(x)\} - \{P(b) - P(x)\} + P(b) - N(b)$$

$$= v(b) + \{N(b) - N(x)\} - \{P(b) - P(x)\}. \qquad (4)$$

Here $v(b)$ is constant. $N(b) - N(x)$, $P(b) - P(x)$ satisfy the conditions imposed on $v(x)$ in
Abel's lemma, and reduce to $N(b)$, $P(b)$ when $x = a$. Hence

$$\left|\int_a^b \cos tx\,v(x)\,dx\right| \le \frac{2}{t}\{|v(b)| + N(b) + P(b)\} = \frac{2}{t}\{|v(b)| + V\}, \qquad (5)$$

$$\left|\int_a^b \sin tx\,v(x)\,dx\right| \le \frac{2}{t}\{|v(b)| + V\}. \qquad (6)$$

3·03a. The statement that the only isotropic tensors of orders 2 and 3 are scalar
multiples of $\delta_{ik}$ and $\epsilon_{ikm}$ respectively is easily proved by the method of 3·031.


5*04 a. If 

f(x,y) = /(0, 0) = 0, 

$\partial f/\partial x = \partial f/\partial y = 0$ at (0, 0). Also if $x = r\cos\theta$, $y = r\sin\theta$, $f/r \to 0$ when $r \to 0$ for any
fixed $\theta$. Hence $\partial f/\partial r = 0$, and the gradient of $f$ in any direction satisfies the vector rule.
But $f$ is not differentiable, because for $r < \delta$ we could always take $y = x^2$, which makes
$f/r = \tfrac12$. Hence differentiability of a function is not a necessary condition for the gradient
to be a vector as defined in 3·06. The property (5) is, however, so important that the vector
property of the derivative tells us little if (5) is not satisfied. For instance, if $f$ is
differentiable, and $r \to 0$ in any manner,

$$\frac{1}{r}\{f(x, y) - f(0, 0)\} \to \frac{\partial f}{\partial x}\cos\theta + \frac{\partial f}{\partial y}\sin\theta,$$

but this is false for the above example.


5·07a. A surface satisfying the conditions of 5·07 can be enclosed in an arbitrarily small
volume. Since $l^2 + m^2 + n^2 = 1$, at every point of the surface at least one of $l^2$, $m^2$, $n^2 \ge 1/3$.
Take the points where $n^2 \ge 1/3$; these give a region or regions of $x$, $y$ since $n$ is continuous,
and

$$1 + F_x^2 + F_y^2 = 1/n^2 \le 3.$$

Then $F_x^2 \le 2$, $F_y^2 \le 2$.

Hence $|F(x+h, y) - F(x, y)| \le h\sqrt2$, $|F(x, y+k) - F(x, y)| \le k\sqrt2$.

Take $k \le h$. Then a parallelepiped of sides $2h$, $2k$, $2h\sqrt2$ centred at $(x, y, z)$ will include all
points of $S$ where $|\xi - x| < h$, $|\eta - y| < k$, and overlap similar parallelepipeds centred at
points of $S$ corresponding to adjacent points of the lattice; and such a set of parallelepipeds
about points corresponding to all points of an $(h, k)$ lattice in $(x, y)$ will therefore include the
whole of $S$. Let the extents of $x$, $y$ be $H$, $K$. Then the number of lattice points is

$$\left(\frac{H}{h} + 1\right)\left(\frac{K}{k} + 1\right),$$

and the total volume of the parallelepipeds is

$$8h^2k\sqrt2\left(\frac{H}{h} + 1\right)\left(\frac{K}{k} + 1\right) = 8\sqrt2\,(Hh + h^2)(K + k),$$

which tends to 0 with $h$ since $k \le h$.

Apply a similar argument to the points where $l^2$ or $m^2 \ge 1/3$ and the result follows by
addition.

5·08a. Green derived the theorem known by his name* by separating the terms and
integrating by parts. M. V. Ostrogradsky† gave the divergence theorem explicitly, but of
course all principles used in it are included in Green's argument.

* Collected papers, p. 23; Essay on the application of mathematical analysis to electricity and
magnetism, Nottingham, 1828.

† Mem. Acad. Imp. Sci. St Petersburg (6) 1, 1831, 130.

6·043a. Clearly if $\rho$ satisfies the condition everywhere, and we altered it by a finite
amount at an isolated point, $\phi$ and therefore $\nabla^2\phi$ would not be altered; but Poisson's
equation would be false at this point. Integrability of $\rho$ is therefore not a sufficient condition;
these considerations suggest that continuity might be a necessary and sufficient
condition. It is in fact necessary, but not sufficient. A weaker sufficient condition than
the one we have assumed was given by Hölder, namely that for any point $Q$ different from
$P$, $|\rho_Q - \rho_P| < Ar^{\alpha}$, where $A$ and $\alpha$ are fixed and $\alpha$ is positive. It is an extension to three
dimensions of the Lipschitz condition for one.* Continuity alone is not sufficient. If the
density in a sphere of radius $a$ is, in polar coordinates,

$$\rho = \frac{3\cos^2\theta - 1}{\log(b/r)}, \qquad b > a,$$

$\rho$ is continuous but has unbounded derivatives near $r = 0$ and does not satisfy a Hölder
condition. It can then be shown that for $0 < r < a$

$$\nabla^2\phi = -4\pi\gamma\rho$$

has a solution containing a term

$$\phi_0 = \tfrac45\pi\gamma r^2 \log\log(b/r)\,(3\cos^2\theta - 1)$$

but the second derivatives of $\phi_0$ do not exist at $r = 0$. Modified definitions of $\nabla^2\phi$ have been
proposed that make Poisson's theorem true for all continuous $\rho$.† But even with the
ordinary definition continuous distributions of $\rho$ that make Poisson's equation false are
very rare.

9·04a. The throw-back can also be used with Bessel's formula, as has been pointed out
by Comrie. The coefficient of the fourth difference in this formula is $\tfrac{1}{12}(\theta+1)(\theta-2)$
times that of the second, and this ratio varies from $-\tfrac16$ to $-\tfrac{3}{16}$; the variation is even less
than that of the corresponding ratio in Everett's formula. The ratio of the coefficient of
the fifth difference to that of the third is $\tfrac{1}{20}(\theta+1)(\theta-2)$. Consequently it is advantageous,
if fourth and fifth differences cannot be neglected, to take

$$f(x_0 + \theta h) = f_0 + \theta\,\delta f_{1/2} + \frac{\theta(\theta-1)}{2!}\left(\mu\delta^2 f_{1/2} - 0·184\,\mu\delta^4 f_{1/2}\right) + \frac{\theta(\theta-\tfrac12)(\theta-1)}{3!}\left(\delta^3 f_{1/2} - 0·108\,\delta^5 f_{1/2}\right).$$
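
The two ratios quoted in this note are easy to tabulate exactly. The sketch below uses rational arithmetic; the function names are illustrative, and the only inputs are the expressions $\tfrac{1}{12}(\theta+1)(\theta-2)$ and $\tfrac{1}{20}(\theta+1)(\theta-2)$ given in the text.

```python
from fractions import Fraction

def fourth_to_second(theta):
    """Ratio of the Bessel coefficients of the fourth and second differences."""
    return (theta + 1) * (theta - 2) / 12

def fifth_to_third(theta):
    """Ratio of the Bessel coefficients of the fifth and third differences."""
    return (theta + 1) * (theta - 2) / 20

if __name__ == "__main__":
    # Both ratios are symmetrical about theta = 1/2, so 0 and 1/2 give the extremes.
    for theta in (Fraction(0), Fraction(1, 4), Fraction(1, 2)):
        print(theta, fourth_to_second(theta), fifth_to_third(theta))
```

The extreme values of the first ratio are $-\tfrac16$ and $-\tfrac{3}{16}$, and of the second $-\tfrac{1}{10}$ and $-\tfrac{9}{80}$, so constant multipliers chosen within these narrow ranges can stand in for the varying ratios.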

9·05a. This method of transforming the equation is given by Newton in De Analysi,
1669, and illustrated by the equation $x^3 - 2x - 5 = 0$. Synthetic division is not used in
Horner's original paper;‡ he used another method of transformation. Synthetic division
was introduced and applied to this problem by Horner in a further series of papers.§ He
did not multiply the roots by 10 at each stage. He emphasizes the importance of proceeding
one figure at a time in the early stages of the work, and this is perhaps his most
important contribution. As it happened, the real root of the equation used by Newton for
illustration is very close to +2·10. Consequently it is impossible to say from this example
alone whether Newton habitually tried to obtain several figures at a time or not. He does
so in the later stages of the work, when higher powers are becoming negligible, but so
does Horner.

It appears from Horner's papers that Newton's method had been completely forgotten.
When he speaks of 'Newton's method' he means the iterative method stated in geometrical
form in the Principia (Lib. 1, Prop. 23) for solving $\theta - e\sin\theta = N$ ($e$, $N$ given),
and still usually known as Newton's method. It seems to have been first applied to
algebraic equations by Raphson. It contains no provision for making the determination
of early figures facilitate that of later ones. For non-algebraic equations such provision
is best made in the method of inverse interpolation.

* O. D. Kellogg, Foundations of Potential Theory, 1929, 152-56.

† H. Petrini, Acta Math. 31, 1908, 127-332; G. Birkhoff and L. Burton, Canadian J. Math. 1, 1949,
199-208.

‡ Phil. Trans. 109, 1819, 306-36. § Leybourn's Mathematical Repository, 5, 1820.

9·09a. The comparison of the Simpson and three-eighths rules takes the total range
the same for both, so that Simpson's rule uses one intermediate value and the three-eighths
rule two. If the lengths of the intervals are the same for both the advantage is the other
way. Thus

$$\int_{-3}^{3} x^4\,dx = \tfrac25 \times 243 = 97·2.$$

Using unit intervals we have

x      −3   −2   −1    0    1    2    3

x^4    81   16    1    0    1   16   81

Simpson's rule gives $\tfrac13\{162 + 4 \times 32 + 2 \times 2\} = \tfrac13 \times 294 = 98·0$.

The three-eighths rule gives

$$\tfrac38(81 + 3 \times 16 + 3 \times 1 + 0) + \tfrac38(0 + 3 \times 1 + 3 \times 16 + 81) = \tfrac38 \times 132 \times 2 = 99·0.$$

The possibility of this comparison arises only when the number of intervals is a multiple
of 6.
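
The arithmetic of this comparison can be reproduced exactly with rational numbers. The following is a minimal sketch; the function names `simpson` and `three_eighths` are illustrative, and both rules are the usual composite forms applied to equally spaced ordinates.

```python
from fractions import Fraction

def simpson(values, h):
    """Composite Simpson rule; needs an even number of intervals."""
    n = len(values) - 1
    assert n % 2 == 0
    total = values[0] + values[-1]
    total += 4 * sum(values[1:n:2]) + 2 * sum(values[2:n:2])
    return Fraction(h) * total / 3

def three_eighths(values, h):
    """Composite three-eighths rule; needs a multiple of three intervals."""
    n = len(values) - 1
    assert n % 3 == 0
    total = Fraction(0)
    for i in range(0, n, 3):
        total += values[i] + 3 * values[i + 1] + 3 * values[i + 2] + values[i + 3]
    return Fraction(3, 8) * h * total

if __name__ == "__main__":
    ordinates = [x ** 4 for x in range(-3, 4)]   # 81, 16, 1, 0, 1, 16, 81
    exact = Fraction(2 * 3 ** 5, 5)              # integral of x^4 from -3 to 3
    print(float(exact), simpson(ordinates, 1), three_eighths(ordinates, 1))
```

With equal intervals Simpson's rule is in error by 0·8 and the three-eighths rule by 1·8, which is the reversal of advantage described above.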

9·09b. These rules have been used with success by S. Chandrasekhar in the solution of
integral equations.* When a method analogous to that of 4·17 is used, and 10 equally
spaced values would be needed to give the accuracy needed, this accuracy may be
achieved with 5 points suitably spaced. The smaller number of equations to be solved
compensates for the inconvenience of interpolation.

9·10a. With any method of numerical solution of differential equations, rounding-off
errors tend to accumulate, and as each is carried on to the next step they cannot be
detected by differencing. This is particularly serious for a differential equation of the form

$$y'' = f(x)\,y,$$

where $f(x)$ is positive. One solution, $y_1$, increases with $x$, another, $y_2$, decreases. Then the
first two values of a solution $y$ can be represented exactly by a function of the form
$Ay_1 + By_2$. If we start from 0 and try to compute $y_2$, the first two values actually chosen
will have rounding-off errors, which can be represented by a term in $y_1$, and the latter
will increase steadily throughout the work, while $y_2$ itself diminishes. Thus the proportional
error will increase for both reasons. Consequently it is desirable, in solution of
equations of this type, to work in the direction of increasing $x$ if we want $y_1$, but in that of
decreasing $x$ if we want $y_2$.
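
The effect can be seen in the simplest case $f(x) = 1$. The sketch below is an illustration under stated assumptions, not a method from the text: it uses the central-difference recurrence $y_{n+1} = (2 + h^2)y_n - y_{n-1}$, whose decreasing solution is $\lambda_2^n$ with $\lambda_2$ the smaller root of $\lambda^2 - (2+h^2)\lambda + 1 = 0$, and it imitates a rounding-off error by rounding the second starting value to six decimals. All names are hypothetical.

```python
import math

def march(first, second, h, steps):
    """Three-term recurrence for y'' = y with step h:
    y[n+1] = (2 + h^2) y[n] - y[n-1].
    The same recurrence serves for marching in either direction.
    Returns steps + 1 values, beginning with the two starting values."""
    ys = [first, second]
    for _ in range(steps - 1):
        ys.append((2.0 + h * h) * ys[-1] - ys[-2])
    return ys

h, steps = 0.1, 200
# Decreasing root of the recurrence (the discrete analogue of exp(-h)).
lam2 = 1.0 + 0.5 * h * h - h * math.sqrt(1.0 + 0.25 * h * h)

# Forward: start from the decreasing solution with y[1] rounded to six decimals.
forward = march(1.0, round(lam2, 6), h, steps)

# Backward: start from crude values at the far end, then normalise at x = 0.
backward = march(0.0, 1.0, h, steps)[::-1]
backward = [y / backward[0] for y in backward]

if __name__ == "__main__":
    print("true value at the far end  :", lam2 ** steps)
    print("forward march at far end   :", forward[-1])
    print("backward march, y[1]       :", backward[1], "true:", lam2)
```

Marching forward, the small rounding error in the second value excites the increasing solution, which eventually swamps the wanted one; marching backward, even very crude starting values give the decreasing solution accurately near $x = 0$.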

9·11a. Since

$$f(a+h) = f(a) + \nabla f(a+h), \qquad \nabla^n f(a) = (\nabla^n - \nabla^{n+1})\,f(a+h),$$

we can write the Adams-Bashforth formula as

$$\frac{1}{h}\int_a^{a+h} f(x)\,dx = f(a+h) - \nabla f(a+h) + \left(\tfrac12\nabla + \tfrac{5}{12}\nabla^2 + \ldots\right)(1 - \nabla)\,f(a+h)$$

$$= f(a+h) - \left(\tfrac12\nabla + \tfrac{1}{12}\nabla^2 + \tfrac{1}{24}\nabla^3 + \tfrac{19}{720}\nabla^4 + \tfrac{3}{160}\nabla^5 \ldots\right)f(a+h).$$

In this formula the coefficients of the second and higher differences are much smaller than
in the original one, and it is correspondingly more accurate. As for the central-difference
formulae, the procedure is first to extrapolate a value of $f(a+h)$ and then improve it by
successive approximation.

* Radiative Transfer, 1950, Chapter II.
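
The coefficients in the second form can be checked by multiplying the Adams-Bashforth series by $(1 - \nabla)$. The sketch below generates the Adams-Bashforth coefficients from the recurrence $g_n + \tfrac12 g_{n-1} + \ldots + g_0/(n+1) = 1$, which is a standard property of these coefficients assumed here rather than taken from the text; the function names are illustrative.

```python
from fractions import Fraction

def adams_bashforth_coefficients(n_terms):
    """Coefficients g[0], g[1], ... in
    (1/h) * integral from a to a+h of f dx = sum g[n] * nabla^n f(a),
    from the recurrence g[n] + g[n-1]/2 + ... + g[0]/(n+1) = 1."""
    g = []
    for n in range(n_terms):
        g.append(Fraction(1) - sum(g[j] / Fraction(n - j + 1) for j in range(n)))
    return g

def end_point_form(g):
    """Multiply the series by (1 - nabla), so that it acts on f(a + h)."""
    return [g[0]] + [g[n] - g[n - 1] for n in range(1, len(g))]

if __name__ == "__main__":
    g = adams_bashforth_coefficients(6)
    print([str(c) for c in g])                   # 1, 1/2, 5/12, 3/8, 251/720, 95/288
    print([str(c) for c in end_point_form(g)])   # 1, -1/2, -1/12, -1/24, -19/720, -3/160
```

The second list reproduces the coefficients of the formula in terms of $f(a+h)$, and shows how much smaller they are than those of the original series.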

9*14a. The Gauss-Jackson method can be adapted to the solution of 



if we have a means of calculating $dy/dx$ at the tabular values of $x$. We have

■ h/^ + -)y- 

Substituting y from (10) we find 

The coefficients are the same as in 9-084 (8). The extra trouble of forming 

(5- 1 - ^8 4-...) (f n _ i/ 2 , /„+1/ 2 ) 

and taking the mean is not prohibitive. 

The Euler-Maclaurin formula leads at once to the integration formula

$$y_1 - y_0 = \tfrac12 h\{(y_0' + y_1') - \tfrac16 h(y_1'' - y_0'') + \tfrac{1}{360}h\,\delta^3 y''_{1/2}\} + O(h^7).$$

This resembles the central-difference formulae for double integration in the small factor 
associated with the third term. Consequently, if y" is easily calculable, this formula can 
be used for solution of first-order equations as easily as the central-difference formulae 
for second-order equations with the first derivative absent.* 

9·16a. Southwell's method depends on the same principles as Seidel's. Its distinctive
features are: (1) at each stage a record is made of the outstanding residuals of all equations;
(2) the next step is to reduce (liquidate, in Southwell's language) the largest residual;
(3) no attempt is made to obtain more than one figure at a time in the next approximation.
Thus, with the same equations as before, the largest term on the right is in the third
equation. Take $x = y = 0$, $z = +2$ as the first approximation. The left sides are +2·0,
−5·2, +11·4. Subtract these from the right sides, leaving the residuals +5·8, +2·9, −2·8.
The largest is the first; take $x = +1$ and proceed. The values given in later approximations
are, of course, corrections to the approximations already found. In each pair of columns
below, the first gives the change in the left sides and the second the residuals remaining:

                              z = +2         x = +1         y = +1         x = +0·5       z = −0·3
 6·3x − 3·2y + 1·0z = +7·8    +2·0   +5·8    +6·3   −0·5    −3·2   +2·7    +3·15  −0·45   −0·30  −0·15
−3·2x + 8·4y − 2·6z = −2·3    −5·2   +2·9    −3·2   +6·1    +8·4   −2·3    −1·6   −0·7    +0·78  −1·48
+1·0x − 2·6y + 5·7z = +8·6   +11·4   −2·8    +1·0   −3·8    −2·6   −1·2    +0·5   −1·7    −1·71  +0·01

                              y = −0·2       x = −0·1       z = −0·1       y = −0·05      x = −0·04
 6·3x − 3·2y + 1·0z = −0·15   +0·64  −0·79   −0·63  −0·16   −0·10  −0·06   +0·16  −0·22   −0·25  +0·03
−3·2x + 8·4y − 2·6z = −1·48   −1·68  +0·20   +0·32  −0·12   +0·26  −0·38   −0·42  +0·04   +0·13  −0·09
+1·0x − 2·6y + 5·7z = +0·01   +0·52  −0·51   −0·10  −0·41   −0·57  +0·16   +0·13  +0·03   −0·04  +0·07

                              y = −0·01      z = +0·01      y = +0·002     z = −0·001
 6·3x − 3·2y + 1·0z = +0·03   +0·03   0·00   +0·01  −0·01   −0·01   0·00   −0·00   0·00
−3·2x + 8·4y − 2·6z = −0·09   −0·08  −0·01   −0·03  +0·02   +0·02  +0·00   +0·00   0·00
+1·0x − 2·6y + 5·7z = +0·07   +0·03  +0·04   +0·06  −0·02   −0·01  −0·01   −0·01   0·00

* D. R. Hartree, Proc. Camb. Phil. Soc. 46, 1950, 523-4.

The solution is

x = +1·0 + 0·5 − 0·1 − 0·04 = +1·36,
y = +1·0 − 0·2 − 0·05 − 0·01 + 0·002 = +0·742,
z = +2·0 − 0·3 − 0·1 + 0·01 − 0·001 = +1·609.

It is, in general, worth while to overcorrect at each stage in this method (and in Seidel’s). 
If, for instance, we increase x in one approximation to remove the residual in the first 
equation exactly, then it will increase the residual of the second. This will be compensated 
by an increase in y. But this will again increase the residual of the first equation, and x will 
need a further increase. For this reason, especially if the non-diagonal coefficients are not 
small, convergence can be made more rapid by overcorrecting. 

If, for instance, the correction to $x$ needed to remove the residual in the $x$ equation at
some stage is $\delta_x$, and we actually increase $x$ by anything between $\delta_x$ and $2\delta_x$, it is easy to
see that we shall always decrease $S$. In the relaxation method applied to differential
equations it is often worth while to take the correction as f 8 X or even f£ x . 
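
The procedure can be sketched in a few lines. This is a simplified variant under stated assumptions, not Southwell's hand method: at each step it liquidates the largest residual exactly (or overcorrects by a factor `omega`), instead of taking one figure at a time, so its intermediate steps differ from the table above although it tends to the same solution. The function name `relax` and its parameters are illustrative.

```python
def relax(A, b, omega=1.0, tol=1e-10, max_steps=10000):
    """Relaxation for A x = b, A symmetric with positive diagonal.

    At each step the equation with the largest residual is selected and the
    corresponding unknown is changed by omega * residual / diagonal element.
    omega = 1 removes that residual exactly; 1 < omega < 2 overcorrects."""
    n = len(b)
    x = [0.0] * n
    residual = list(b)                  # residuals for the starting guess x = 0
    for _ in range(max_steps):
        i = max(range(n), key=lambda j: abs(residual[j]))
        if abs(residual[i]) < tol:
            break
        delta = omega * residual[i] / A[i][i]
        x[i] += delta
        for j in range(n):              # record the new residuals of all equations
            residual[j] -= A[j][i] * delta
    return x, residual

A = [[6.3, -3.2, 1.0],
     [-3.2, 8.4, -2.6],
     [1.0, -2.6, 5.7]]
b = [7.8, -2.3, 8.6]

if __name__ == "__main__":
    print(relax(A, b)[0])               # close to x = 1.36, y = 0.742, z = 1.609
    print(relax(A, b, omega=1.2)[0])    # overcorrected steps, same limit
```

Keeping a running record of all residuals, rather than recomputing them, mirrors feature (1) of the method; the parameter `omega` corresponds to the overcorrection discussed above.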

11·171a. Another way of stating the theorem for $m = 0$ is: If $f(z)$ is bounded and
analytic in a neighbourhood of $z = 0$, except possibly at $z = 0$, then a function $g(z)$ exists
that is equal to $f(z)$ except possibly at $z = 0$ and is analytic also at $z = 0$. For the Laurent
expansion of $f(z)$ has the property required.

13·05a. The statement that the failure of a moving liquid to turn a sharp corner
smoothly is due to the formation of a negative pressure is still to be found in text-books
of hydrodynamics. As Rayleigh pointed out many years ago, the same phenomenon
occurs in a gas, in which there is no question of negative pressure.*

13·091a. If in Osgood's function (11·18) we put $z' = z + 1/z$ we get a function of $z'$
bounded over the whole $z'$ plane, but not constant. This does not contradict Liouville's
theorem because there is a line of singularities from −2 to 2.

14·08a. Lebesgue† has given a direct proof of the theorem for polynomial approximations,
not depending on the use of integration. Other proofs independent of the use of
Fourier series exist. One due to Weierstrass, applicable to any number of dimensions, is
to take, for instance,

$$g(x, y, z) = \left(\frac{k}{\pi}\right)^{3/2}\iiint_D f(\xi, \eta, \zeta)\exp[-k\{(\xi - x)^2 + (\eta - y)^2 + (\zeta - z)^2\}]\,d\xi\,d\eta\,d\zeta,$$

where $f$ is continuous in $D$. $k$ can be taken large enough for $|g - f|$ to be uniformly $< \omega$,
in any $D'$ interior to $D$; and then by expanding the exponentials in $D$ we get the required
approximation. See also Courant and Hilbert, 1, 69-72; Littlewood, A Mathematician's
Miscellany, 30-4.

* See also H. Jeffreys, Proc. Roy. Soc. A, 128, 1930, 376-93.
† Bull. des Sci. Math. 22, part 1, 1898, 278-87.

14·13a. A method extensively recommended in recent years for the solution of linear
differential equations is (1) to apply the Laplace transformation to the whole of the equation,
(2) hence determine the Laplace transform of the solution, (3) identify this by reference
to a list of Laplace transforms of known functions. In criticism of (1), without preliminary
study of the properties of solutions of the equation in general, there is no reason
to suppose that the Laplace transform exists; and of (3), that the Bromwich integral
actually gives the answer whether the function has already appeared in the various lists
or not. Even if it has, a special theorem is still needed to establish uniqueness of the
solution. It is remarkable that such treatment is advocated even for finite sets of ordinary
differential equations, for which the direct operational treatment leads to a straightforward
proof of existence and uniqueness of the solution. For partial differential equations
substitution of complex integrals in Bromwich's manner proves existence; uniqueness
would require use of the general theory of such equations. The Laplace transform method
proves neither.*

If a solution is wanted for $0 \le t \le T$, it is possible to apply the Laplace transform method
using integration from 0 to $T' > T$ instead of from 0 to $\infty$. The resulting transform naturally
depends on $T'$, but its interpretation by the Bromwich integral can be proved without
much difficulty to be the same for all $t \le T$ so long as $T' \ge T$.

17*07 a . Ai(z) and Bi(z) are tabulated in the British Association Mathematical Tables. 

For information about existing tables of numerical values of functions the Index of 
Mathematical Tables by A. Fletcher, J. C. P. Miller and L. Rosenhead (1946) should be 
consulted. 

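18·02a. The gradient of a scalar function $\phi$ in general orthogonal coordinates has
components, in the directions of $\xi_1$, $\xi_2$, $\xi_3$ increasing,

$$u_1 = \frac{\partial\phi}{h_1\,\partial\xi_1}, \qquad u_2 = \frac{\partial\phi}{h_2\,\partial\xi_2}, \qquad u_3 = \frac{\partial\phi}{h_3\,\partial\xi_3}.$$

The divergence of a general vector function $u$ is found by considering the flux out of an
element as in 18·02; it is

$$\operatorname{div} u = \frac{1}{h_1h_2h_3}\left\{\frac{\partial}{\partial\xi_1}(h_2h_3u_1) + \frac{\partial}{\partial\xi_2}(h_3h_1u_2) + \frac{\partial}{\partial\xi_3}(h_1h_2u_3)\right\}.$$

The curl of a general vector is found by considering the integral of its normal component
over a surface of constant $\xi_1$ with $\xi_2$ constant over one pair of edges and $\xi_3$ constant over
the other pair; the values of $\xi_2$, $\xi_3$ over opposite edges differ by $\delta\xi_2$, $\delta\xi_3$. Then by Stokes's
theorem this integral is $\int u_i\,dx_i$ around the element. Expressing this in terms of the
components and taking $\delta\xi_2$, $\delta\xi_3$ small we find

$$(\operatorname{curl} u)_1 = \frac{1}{h_2h_3}\left\{\frac{\partial}{\partial\xi_2}(h_3u_3) - \frac{\partial}{\partial\xi_3}(h_2u_2)\right\}.$$

Cylindrical coordinates $(\varpi, \lambda, z)$:

$$\operatorname{grad}\phi = \left(\frac{\partial\phi}{\partial\varpi}, \frac{\partial\phi}{\varpi\,\partial\lambda}, \frac{\partial\phi}{\partial z}\right),$$

$$\operatorname{div} u = \frac{1}{\varpi}\frac{\partial}{\partial\varpi}(\varpi u_\varpi) + \frac{1}{\varpi}\frac{\partial u_\lambda}{\partial\lambda} + \frac{\partial u_z}{\partial z},$$

$$(\operatorname{curl} u)_\varpi = \frac{1}{\varpi}\frac{\partial u_z}{\partial\lambda} - \frac{\partial u_\lambda}{\partial z},$$

$$(\operatorname{curl} u)_\lambda = \frac{\partial u_\varpi}{\partial z} - \frac{\partial u_z}{\partial\varpi},$$

$$(\operatorname{curl} u)_z = \frac{1}{\varpi}\frac{\partial}{\partial\varpi}(\varpi u_\lambda) - \frac{1}{\varpi}\frac{\partial u_\varpi}{\partial\lambda}.$$

* See also H. Jeffreys, Proc. Inst. Elec. Eng. (Heaviside Centenary Volume, 1950).

Spherical polar coordinates $(r, \theta, \lambda)$:

$$\operatorname{grad}\phi = \left(\frac{\partial\phi}{\partial r}, \frac{\partial\phi}{r\,\partial\theta}, \frac{\partial\phi}{r\sin\theta\,\partial\lambda}\right),$$

$$\operatorname{div} u = \frac{1}{r^2\sin\theta}\left\{\frac{\partial}{\partial r}(r^2\sin\theta\,u_r) + \frac{\partial}{\partial\theta}(r\sin\theta\,u_\theta) + \frac{\partial}{\partial\lambda}(r u_\lambda)\right\},$$

$$(\operatorname{curl} u)_r = \frac{1}{r\sin\theta}\left\{\frac{\partial}{\partial\theta}(\sin\theta\,u_\lambda) - \frac{\partial u_\theta}{\partial\lambda}\right\},$$

$$(\operatorname{curl} u)_\theta = \frac{1}{r\sin\theta}\left\{\frac{\partial u_r}{\partial\lambda} - \frac{\partial}{\partial r}(r\sin\theta\,u_\lambda)\right\},$$

$$(\operatorname{curl} u)_\lambda = \frac{1}{r}\left\{\frac{\partial}{\partial r}(r u_\theta) - \frac{\partial u_r}{\partial\theta}\right\}.$$

Components of strain in these coordinates are given in Love's Elasticity, 1905, p. 56;
but his $h_i$ are the present $1/h_i$; and his components $e_{23}$, $e_{31}$, $e_{12}$ are twice those adopted here.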

18·05a. There are many papers on Mathieu's and related equations, especially by
E. L. Ince and S. Goldstein, who first produced adequate general methods for computation.
References are given by W. G. Bickley.* The fullest account is by N. W. McLachlan.†

23*08a. Wave problems. Transmission and reflexion. Solutions of 

g+AV-cV = 0 (1) 

are exp (— fyihz 2 ) ( U, V) (a, •£, ihz 2 ), a = £(1 — ihc 2 )„ (2) 

If the time factor for y > 0 is eW, U exp (iyt — \ihz 2 ) represents an advancing wave for 
z > 0. For z < 0 this solution becomes 

exp (— \ihz 2 ) U (a, ihz 2 e 2in ). (3) 

From 23*04 (25) and 23*05 (23) we find that for | arga; | < n 

2jrie~^ 7r 

U{xe 2i7t ) = —— jyy ^ V(x) + (2 e^v” cos jtt — e 2 ^*-^) JJ(x), (4) 

from which we find that (3) is equal to 

[ 2mV—”1 

(a — 1)! — Vihz2 }~ eUa ~ i)n u ( a > h ♦***)J (5) 

- ex P(-[ ( g _ ifTfi _ jj i F+^p], (6) 

The moduli of the coefficients are found to be $(1+e^{hc^2\pi})^{1/2}$, $e^{\frac12 hc^2\pi}$. The term in $V$ represents
the incident wave for $z < 0$, since its phase decreases with $|z|$. Thus the amplitudes of the
incident, reflected and transmitted waves are in the ratios‡

$$(1+e^{hc^2\pi})^{1/2},\qquad e^{\frac12 hc^2\pi},\qquad 1.$$

* Phil. Mag. (7), 30, 1940, 310-22. 

† Theory and Application of Mathieu Functions, Oxford, 1947.

‡ Dr J. Heading gave us this result, found by a different method.
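
As a numerical illustration of the ratios in 23·08a, the following Python sketch (ours, not the authors’) evaluates them on the assumption that they read $(1+e^{hc^2\pi})^{1/2} : e^{\frac12 hc^2\pi} : 1$. It also checks that the squared amplitudes of the reflected and transmitted waves add up to that of the incident wave. The function names and the sample values of `h` and `c` are hypothetical.

```python
import math

def amplitude_ratios(h, c):
    """Relative amplitudes (incident, reflected, transmitted).

    Assumes the ratios (1 + exp(h c^2 pi))^(1/2) : exp(h c^2 pi / 2) : 1.
    """
    s = h * c * c * math.pi
    return math.sqrt(1.0 + math.exp(s)), math.exp(0.5 * s), 1.0

def reflexion_and_transmission(h, c):
    """Fractions of the incident squared amplitude reflected and transmitted."""
    incident, reflected, transmitted = amplitude_ratios(h, c)
    return (reflected / incident) ** 2, (transmitted / incident) ** 2

# Sample values chosen only for illustration.
for h, c in [(1.0, 0.0), (1.0, 0.5), (2.0, 1.0)]:
    refl, trans = reflexion_and_transmission(h, c)
    print(f"h={h}, c={c}: reflected {refl:.4f}, transmitted {trans:.4f}")
```

On this reading the two fractions sum to unity for every $h$ and $c$, and for $c = 0$ the wave is divided equally between reflexion and transmission.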
25·031a. When $x$ goes from $-1$ to $1$ by any path in the upper half-plane, $\Re(t)$ goes
from $-K$ to $K$; hence every value of $\Re(t)$ in this interval specifies a curve going from
a point between $-1$ and $1$ on the real axis to infinity. But on this curve $\Im(t)$ goes from
$0$ to $K'$. Hence for any $t$ satisfying $-K < \Re(t) \le K$, $0 \le \Im(t) < K'$, there is a value of $x$
specified by a path from 0 in the upper half-plane. Similarly if

$$-K < \Re(t) < K, \qquad -K' < \Im(t) < 0,$$

there is a suitable path in the lower half-plane. Hence, by including suitable numbers of
circuits about the paths $C$ and $C'$, we can make $t$ take any assigned finite value. This is
the important property of ubiquity: that the integral can take any value of $t$ and hence
that its inversion defines $x$ over the whole $t$ plane.





APPENDIX ON NOTATION 


The difficulty of learning mathematical physics is much increased by confusion 
of notation, especially the overworking of certain letters and the introduction of 
awkward sign conventions. The only criterion usually recognized is conformity with 
‘standard practice’. Unfortunately standard practice is not unique and students are
put to much unnecessary trouble by having to accustom themselves to work with 
different conventions in rapid succession. Research workers in border-line subjects 
are also inconvenienced by finding the usual symbols in one subject pre-empted for 
different meanings in another. 

The following principles are important in choosing conventions: 

(1) Complications should be reduced to a minimum. Negative signs should not be 
introduced without good reason. 

(2) Genuine physical differences should be recognized as such and not disguised as 
conventions; attempts to disguise them always lead to later difficulties that should have 
been forestalled. 

(3) Where a mathematical theory has applications in several subjects the notation 
should be such that it can be carried over into those subjects unchanged; so far as possible 
it should not use symbols already used with other meanings in those subjects. 

The outstanding difficulty of notation at present is the ambiguous use of V and φ.
V is used for potential energy, which is a property of a complete system, but also for the
various potential functions, which are functions of position within the system. It is also
used for Hamilton’s characteristic function, and, in hydrodynamics, for a component of
the velocity at a great distance. There is a tendency at present for potential functions to
be denoted by φ, a usage long established in hydrodynamics. This would remove most of
the difficulty. The characteristic function, which has a rather special field of application,
could be denoted by A, and the velocity component by $U_2$ if tensor notation is adopted.
It would therefore be easy to remove the ambiguity of V. The trouble is now that φ is
also used for one spherical polar coordinate, corresponding to the longitude, and there are
many potential problems that require this coordinate. One alternative would be to use
capital Φ for potential in such problems, but this is difficult to write. The other is to find
a new symbol for longitude and, with it, Euler’s second angle. Lamb here uses ω in problems
where a velocity potential exists and φ where none does. The disadvantage of ω
as a regular notation for this purpose is obvious. φ has the disadvantage that in geodesy,
which also depends greatly on the theory of the gravitational potential, φ is used for the
latitude, and the longitude is denoted by λ. The same notation is used in meteorology.
Again, in classical hydrodynamics we often require to use the velocity potential and the
gravitational potential in the same problem, and it is therefore impossible to use φ for
both. If we maintain its original use for the velocity potential, therefore, we must for this
reason alone find another symbol for the gravitational potential. A universal rule that
potential functions are to be denoted by φ is out of the question. The exceptional treatment
of the gravitational potential would remove the difficulty in geodesy and dynamical
meteorology, where velocity potentials do not occur, but not that of electrical and
hydrodynamical systems with axial symmetry, where we need another symbol for the azimuthal
coordinate anyhow. The geodetic practice suggests λ as a suitable one. Something could
be said for ψ; its use as the allied function to φ in two-dimensional theory would lead to
no confusion because the corresponding angle is there usually denoted by θ, and ψ was
used in the works of Routh for the second Eulerian angle (χ would be available for the
third). The replacement of ψ by φ was made only in recent works. Lamb’s Higher Mechanics
uses θ, ψ, χ. Either change would lead to changes in the notation of spherical polar
coordinates, but there seems to be no escape, and in fact the notation given in mathematical
textbooks is not used in a large fraction, perhaps the majority, of problems where position
on a sphere has to be specified.

In dynamics it is generally convenient to use the work function, the work done on a 
system in transporting it from some standard state to its state at time t. This avoids a 
negative sign in the formation of generalized force-components. The potential energy, 
which is the work function with its sign changed, is convenient in the general theory of 
small oscillations about equilibrium, since it is then a positive quadratic form. This is a 
case where a little extra complication in notation is justified. The interest of potential 
energy in its own right really arises in relation to stable systems and becomes dominant 
in electricity and thermodynamics. In the treatment of large motions it is a nuisance. 
There is therefore a definite advantage in having both the work function and the potential 
energy as part of our equipment. In electrical problems potential energy is often denoted 
by W when V has been preoccupied by the potential function; but if we denote the latter 
by φ we can use V for the potential energy and release W. The position therefore is that
we need symbols for work-function and gravitational potential, and U and W are available. 
U is at present widely used for both. It is suggested that W should be used for the work- 
function and U for the gravitational potential. 

With regard to the choice of signs, the following usages have become common, for the 
sake of formal similarity: 


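Generalized force: $$Q_r = -\frac{\partial V}{\partial q_r}, \qquad (1)$$

Electric intensity: $$E_i = -\frac{\partial\phi}{\partial x_i}, \qquad (2)$$

Velocity in fluid: $$u_i = -\frac{\partial\phi}{\partial x_i}, \qquad (3)$$

Gravitational acceleration: $$\ddot{x}_i = -\frac{\partial\Omega}{\partial x_i}. \qquad (4)$$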

The first arises if we use potential energy; the sign is reversed if we use the work function, 
which is, for instance, the easier in all problems of orbits. The second usage has definite 
recommendations. In electrostatics the potential energy is a minimum and is convenient 
to write down, since that of two charges is ee'/r in electrostatic units. We must either have 
the negative sign in the function to be differentiated or insert it after the differentiation. 
The usual potential is the change of potential energy per unit change of charge at the point 
considered. If we used the work function per unit charge instead we should have to put a 
negative sign into the definition of φ, which would mean reversing the signs of all recorded
potential differences. Accordingly (2) must be kept. (4) was introduced by Lamb to make
it analogous with (2). But this is a false analogy. There is a fundamental physical difference
between gravitation and electrostatics: two masses attract, two like charges repel,
and a difference of sign somewhere is inevitable. What Lamb’s convention does is to make 
the gravitational potential always negative and reverse the sign in Poisson’s equation, 
a heavy price to pay for a thin analogy. The obvious course here is to call the work 
function per unit mass U, and replace (4) by the form in use before Lamb’s convention

$$\ddot{x}_i = \frac{\partial U}{\partial x_i}.$$

The negative sign in (3) was also introduced by Lamb. It has never been used outside 
Britain, but several other British writers have copied it on the basis either of Lamb’s 
authority or of a belief that the usage was general in this country. The latter belief is 
mistaken; the chief users of the velocity potential in this country are the workers on 
aerofoil theory, who have continued to use the positive sign as in Glauert’s book. Further, 
Lamb’s book is as generally recognized as the chief authority abroad as here, but his 
convention is not adopted; and Love’s Elasticity, an equally authoritative work, uses the 
positive sign when irrotational displacements occur. The negative sign in (3) can therefore 
be regarded only as an annoying and useless complication. Accordingly all considerations 
of convenient expression indicate that the best relations to take are: 


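$$Q_r = \frac{\partial W}{\partial q_r}\ \text{(general case)} = -\frac{\partial V}{\partial q_r}\ \text{(small oscillations)}, \qquad (1')$$

$$E_i = -\frac{\partial\phi}{\partial x_i}, \qquad (2')$$

$$u_i = \frac{\partial\phi}{\partial x_i}, \qquad (3')$$

$$\ddot{x}_i = \frac{\partial U}{\partial x_i}. \qquad (4')$$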


The difference of sign in (2') and (3') is of little importance because electrostatic and 
hydrodynamical aspects seldom occur in the same problem. Gravitation is often important 
in hydrodynamics and it is convenient to have the same sign in (3') and (4'). 
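
The relation between Lamb’s convention and the work-function convention for gravitation can be illustrated numerically. The Python sketch below is our own illustration, not part of the original text. It takes a point mass at the origin, writes the work function per unit mass as U = γM/r and Lamb’s potential as its negative (called `Omega` here, a name we assume), and differentiates both by central differences. The value of γM and the field point are arbitrary.

```python
import math

GAMMA_M = 1.0  # gravitational constant times attracting mass, arbitrary units

def U(x):
    """Work function per unit mass for a point mass at the origin (positive)."""
    return GAMMA_M / math.sqrt(sum(xi * xi for xi in x))

def Omega(x):
    """Potential in Lamb's convention: the negative of the work function."""
    return -U(x)

def grad(f, x, eps=1e-6):
    """Gradient of f at x by central differences."""
    g = []
    for i in range(3):
        plus, minus = list(x), list(x)
        plus[i] += eps
        minus[i] -= eps
        g.append((f(plus) - f(minus)) / (2.0 * eps))
    return g

x = [2.0, 0.0, 0.0]
acc_work_function = grad(U, x)            # acceleration = +dU/dx_i
acc_lamb = [-g for g in grad(Omega, x)]   # acceleration = -dOmega/dx_i
```

Both conventions give the same acceleration, directed towards the attracting mass; the difference is only that `U` is everywhere positive while `Omega` is everywhere negative.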

A few words are also desirable about the constant γ. This has different values in different
subjects and special symbols may be introduced. It is desirable not to suppress it in 
electrostatics and magnetism even though its numerical measure has been made 1 by 
a choice of units. The choice of a unit does not make it into a number, and inadequate 
analysis of the nature of physical measurement has led to the assertion, which still appears 
in textbooks intended for mathematical students, that the ratio of the electrostatic and 
electromagnetic units is the velocity of light. Further, the theoretical absolute electromagnetic
units are never used, and with the practical units these constants are far from
having numerical measure 1. The omission of the constants is possible only if the 
teaching of electromagnetic theory is completely separated from application to concrete 
systems. 

In Tisserand’s Mécanique Céleste, and in many other works on the subject where the
law of gravitation is most used, the constant is denoted by f (roman or italic). f appears to
be quite free from objection. It has been suggested that it might be mistaken for a
function, but one of us has been using it since 1914 without finding a case where some
letter other than f was not indicated for any function considered. The alternative G seems
to lead to more difficulty in avoiding ambiguity.

The choice of units recommended by Lorentz and Heaviside, so as to absorb the 4π
of Poisson’s equation into γ, seems to have about as much to be said for it as against it.
The theorem of Green’s equivalent stratum asserts nothing but relations between values
of φ, and contains the 4π, and there is no way of removing it from one place without
putting it in in another.

Several different notations are used to indicate that two quantities are not very different: 
0, ===, and = are all used. The first two have precisely defined mathematical senses. But 
in physics we often want to say, without detailed calculation, that two quantities are 
unlikely to differ by more than a factor of 10, or by more than, say, 10 per cent. We suggest 
that the former statement, usually read as ‘a and b are of the same order of magnitude’, 
should be expressed by £ a =£= b ’; and that the second, which can be read ‘ a is a rough estimate 
of b’ or ‘a is approximately equal to b’ can be denoted by ‘ a = b ’, meaning something more
precise than a^=b and something less precise than a statement of extreme possible values 
of the difference. The degree of approximation that is interesting naturally depends on the 
problem. A statistical estimate a = b ± σ has a meaning defined in works on probability
theory and needs no change. 

The expression ‘very approximately’ literally means ‘very closely’; its use to mean 
‘very roughly’ is to be condemned. 
Moore, E. H., 48
Mordell, L. J., 275, 469 
Morera’s theorem, 371 
Morris, R. M., 425 
Motion, under gravity, 77 
of charged particle, 78 
Motz, H., 312 
Multiple integrals, 180 

change of variables in, 182 
Multiple expansions, 660 
Murnaghan, F. D., 105
Mutual induction, 245 

Natural boundary, 369 

Necessary condition, 10 

Neighbourhood, 172, 342 

Nests of intervals, 6 

Neumann, C., 591 

Neville, E. H., 667, 675 

Newman, M. H. A., 22, 175, 347 

Newton, I., 203, 275, 695; see also Interpolation 

Nicholson, J. W., 594 

Nielsen, 576 

Non-concentric circles, 418 
Non-holonomic systems, 322 
Normal coordinates, 145 
Normal modes, 138 
Notation, 203, 702 
Null vector, 64 
Numbers, 1 
complex, 333 
real, 5 
O(x), o(x), 24

Oblate spheroidal coordinates, 539 
Oblique axes, 153 
Ocean, heating of, 572 
Offord, A. C., 431 
Open interval, 19 
region, 174 

Operational methods, 228, 388, 473, 548, 585, 
695 

Operators, composition, 231 
division by, 233 

interpretation, 229, 233, 237, 392 
limits of, 390 
series of, 230 
Orbits, 668 
Ordering relation, 2 
Orders of magnitude, 24, 705 
Orthogonal property of normal modes, 139, 142 
Orthogonality relations, 541, 636 
Oscillation (of function), 22 
Oscillations, small, 144, 252 
Oscillatory, 11 
Osgood, W. F., 369, 698 
Osgood-Vitali theorem, 372, 396
Ostrogradsky, M. V., 694 
Outer product, 115 

p (Heaviside symbol), 237, 398 

Pi, 632 

Pairman, E., 466 

Parabolic cylinder coordinates, 535, 620 
Parabolic equations, 531 
Parallax, 111 
Parallelogram law, 58 
Parapet function, 405 
Parseval’s theorem, 448, 457, 638
Partial differential equations, types of, 531 
Partial fraction rule, 237, 389 
Pathology, 5, 17 
Pauli, W., 151 
Peano, G., 173, 229 
Pearson, K., 261, 269 
Pellew, A., 444 
Pendulum, 668 
Pendulum, inverted, 488 
Periodic disturbance at internal point, 657 
Periodicity, of solutions of differential equation, 
667 

empirical, 450 
Periodogram, 450 
Petrini, H., 695 
Physical magnitudes, 3, 57 
Picard, E., 229, 367, 476
Planetary orbits, 329, 494, 668 
Poincaré, H., 221, 499
Poisson, S. D., 405, 437 
Poisson’s equation, 204, 211, 695 
Poisson’s integral equation, 405 
Poisson’s ratio, 102 
Pol, B. van der, 459, 491, 581 
Poles, 356, 358, 368 
Pollard, S., 26, 30, 338, 347 
Polygonal boundaries, 424 
Potential, 199, 412, 453, 703 
at external points, 543, 635, 641 
in cavity, 210, 629 
in polarized medium, 222 



inside continuous distribution, 208 
of disk, 641 
of line density, 205 
of non-uniform sphere, 642 
of sphere, 204 
of spherical cap, 204 
of spherical shell, 203 
theory, 199 
Power series, 349, 476 
differentiation, 352 
integration, 352 
multiplication, 353 
Powers, non-integral, 355, 359 
Pressure, fluid, 103 
Principal axes, 93 
Principal part at pole, 356 
Principal value of argument, 341 
of integral, 376 
of logarithm, 359 
Principia Mathematica, 2 
Principle of superposition, 239, 404 
Probabilities in chains, 163 
Progressive wave, 548 
Prolate spheroidal coordinates, 540 
Proudman, J., 450 
Pulse, 561, 595 
refraction of, 519 

?*, 579, 654, 656 
Quadratic forms, 136 
Quantum theory, 133, 152, 167, 202 
Quaternions, 74, 169 

Radioactivity, 256 
Radius of convergence, 350 
Raphson, J., 696 
Rayleigh, 253, 526, 553, 698 
Rayleigh-Ritz method, 218
Rayleigh’s principle, 144, 148, 302 
Real numbers, 5 
Reciprocal lattice, 155 
Rectifiable curves, 172, 188 
Reductio ad absurdum, 8 
Refraction of pulse, 519 
Region, 172, 174 

multiply-connected, 175, 348 
Regular function, 339 
singularity, 478 

Relaxation methods, 305, 307, 697 
Removable discontinuity, 18, 357 
Residue, 359 
Resonance, 251 

Response of recording instrument, 460 

Reymond, P. du Bois, 28, 52 

Richardson, L. F., 219, 288, 306, 312 

Riemann, G. F. B., 26, 332, 345, 363 

Riemann’s lemma, 431 

Riemann-Weber, 567 

Riesz, F., 488 

Rigidity, 102 

Rimington, E. C., 246 

Ritz, W., 219, 304 

Robson, A., 73 

Rolle’s theorem, 49 

Rolling, 322 

Rosenhead, L., 667, 699 

Rotating axes, 106 


Rotation, large, 62, 96, 122, 149
small, 79 

Rouché’s theorem, 378
Routh, E. J., 142, 145, 254, 330 
Russell, B., 2 
Rybner, J., 251 

Saddle-point, 504 
Sadler, D. H., 291 
Sawtooth function, 405 
Scalar product, 65 
Scalars, 61 
Schläfli, 574, 649
Schlicht functions, 380 
Schmidt, E., 168, 543 
Schwarz, H. A., 54, 190, 192, 366 
Schwarz’s inequality, 54 
Schwarz-Christoffel transformation, 426 
Schrödinger’s equation, 300, 331, 488, 618
Scientific laws, 57 
Sectionally continuous, 19 
Second mean-value theorem, 52, 692 
Secular instability, 256 
Sedgwick, W. F., 259 
Seidel, P. L. von, 305 
Seismograph, 248 
Self-induction, 246 
Separation of variables, 530 
Sequences, 10 
Series, 14 
double, 16 

expressed by integrals, 491 
Sets, 9, 172 
Sgn (z), 341, 399, 506 
Shook, C. A., 329 
si ( x ), Si ( x ), 471 
Simple discontinuity, 18 
Simple functions, 380 
Simpson’s rule, 286, 696 
Sine transform, 457 

Singular points of differential equation, see Differen¬ 
tial equations 
Singularities, 355 
essential, 357, 367 
at infinity, 357 
lines of, 357, 369 

Singularities of inverse functions, 381 
Small oscillations, 144, 252 
Smithies, F., 168, 543, 692 
Soddy, F., 258 
Solenoidal vector, 196 
Solid harmonics, 630 

Solution of algebraic equations, numerical, 274 
Horner, 275
linear, 304 
Newton, 275, 695 
Raphson, 696 

Southwell, R. V., 218, 307, 312, 444, 697 

Sphere, Green’s function for, 220, 634 

Spherical harmonics, see Legendre functions 

Spherical polar coordinates, 537, 699 

Spherical waves, 559, 659 

Spheroidal coordinates, 537 

Spin matrices, 151 

Stability, 144, 255, 488 

Staircase function, 405 

Standing waves, 549 


Stationary phase, method of, 506 

Steepest descents, method of, 503, 625 

Stieltjes, 26, 467 

Stieltjes integral, 26, 30, 32, 239 

Stirling’s formula, 466, 507 

Stokes, G. G., 40, 221, 507, 511, 561, 645 

Stokes’s theorem, 195 

Stolz, O., 178, 366 

Stoneley, R., 253, 519, 616

Strain, 97, 99 

Stream function, 412 

Stress, 99 

String, vibrating, 546 
loaded, 553 
numerous loads, 597
Stroud, W., 4 

Sturm-Liouville method, 543 
Submarine cable, 602 
Substitution tensor, 59 
Sufficient condition, 10 
Suffix notation, 58 
Summation convention, 59 
Sunspots, 451 

Superposition, principle of, 239, 404 
Surface density, 204, 212 
Surface integrals, 188 


Tannery, J., and Molk, J., 667 
Tannery’s theorem, 48 
Tauber, A., 436 
Taylor, G. I., 531 

Taylor’s theorem, 50, 266, 354, 361, 362 
Temple, G., 144, 304, 495 
Tensors, 86 

antisymmetrical, 91, 97 
isotropic, 87, 691 
quotient rule, 89 
symmetrical, 91, 97 
in two dimensions, 110 
Termini, 26 

Terrestrial magnetism, 639 
Tesseral harmonics, 633 
Thermodynamics, 563 
Theta functions, 680 
Thompson, A. J., 273 
Three-eighths rule, 287, 696 
Throwback, 270, 695 
Tidal equation, 485 
Tides, prediction of, 450 
Tisserand, F., 704 

Titchmarsh, E. C., 29, 368, 453, 522 

Top, 110, 146, 324 

Topology, 176 

Toroidal coordinates, 666 

Transformation of rectangular coordinates, 59 

Transformation theory (dynamics), 328 

Trigonometrical functions, 384 

Triple products of vectors, 74 

Turnbull, H. W., 267 

Turner, H. H., 253, 430 

Unbounded, 11 
Uniformity of continuity, 23 

of convergence (series), 37, 48, 351, 371 
(integrals), 44, 48 
Uniqueness theorems, 215 
Unit function, 18, 243, 393
Unit vector, 64 
Units, 3, 4, 704 

Variation of function, 24 
Variation of parameters, 493, 551 
Variations, calculus of, 314 
Vector, 60 
area, 69 

potential, 224, 660

of point charge and doublet, 226 
product, 67, 92 
single letter notation, 62 
of tensor, 92 
Vector diagram, 245 
Velocity potential, 412 
Vibrating string, 546 
Viscosity, 103 
Vitali, G., 372 
Vortex, line, 206, 604 

Waerden, B. L. van der, 660 
Wagner, K. W., 392 
Wallis, J., 245, 341, 468 
Waring, E., 261 

Watson, G. N., 525, 576, 579, 658 
Watson’s lemma, 501 
Wave equation, 529, 546 
Wave mechanics, 331, 527 
harmonic oscillator, 621 
hydrogen-like atom, 618 
hydrogen molecular ion, 488 
potential barrier, 527, 700 
rotation of axes, 647 


Wave velocity, 512 
Weatherburn, C. E., 84 
Webb, H. A., 616 
Weber, H., 1 

Webster, A. G., 466, 495, 531, 541 

Weddle’s rule, 287 

Weierstrass, 10, 20, 41, 142, 363

Weierstrass’s approximation, 446, 698 

Weierstrass’s elliptic functions, 687

Wessel, C., 245, 341 

Whipple, F. J. W., 323 

Whitehead, A. N., 2 

Whittaker, E. T., 461, 532, 616 

Whittaker and Robinson, G., 286 

Whittaker and Watson, 481, 488, 532, 681 

Widder, D. V., 26 

Wiechert, E., 248, 332 

Wirtinger, W., 279

Wood-Anderson seismograph, 248 

Work function, 321, 703 

Wronskian, 492, 587 

$Y_n(x)$, 576
Young, G. C., 53 

Young, W. H., 21, 53, 178, 366, 431 
Young’s modulus, 102 

Zeta function (elliptic), 686 
(Riemann), 15 
Zobel, O. J., 571
Zonal harmonics, 633 